A First and Second Law for Nonequilibrium 
Thermodynamics: Maximum Entropy Derivation of the 
Fluctuation-Dissipation Theorem and Entropy 
Production Functionals 

David M. Rogers and Susan B. Rempe 
January 26, 2013 

Abstract 

We derive a physically motivated theory for non-equilibrium systems from a max- 
imum entropy approach similar in spirit to the equilibrium theory given by Gibbs. 
Requiring Hamilton's principle of stationary action to be satisfied on average during a 
trajectory, we derive constraints on the transition probability distribution which lead to 
a path probability of the Onsager-Machlup form. Additional constraints derived from 
energy and momentum conservation laws then introduce heat exchange and external 
driving forces into the system, with Lagrange multipliers related to the temperature 
and pressure of an external thermostatic system. The result is a fully time-dependent, 
non-local description of a nonequilibrium ensemble coupled to reservoirs at arbitrary 
thermostatic states. Detailed accounting of the energy exchange and the change in 
information entropy of the central system then provides a description of the entropy 
production which is not dependent on the specification or existence of a steady-state 
or on any definition of thermostatic variables for the central system. These results 
are connected to the literature by showing a method for path re-weighting, creation of 
arbitrary fluctuation theorems, and by providing a simple derivation of Jarzynski rela- 
tions referencing a steady-state. In addition, we identify path free energy and entropy 
(caliber) functionals which generate a first law of nonequilibrium thermodynamics by 
relating changes in the driving forces to changes in path averages. Analogous to the 
Gibbs relations, the variations in the path averages yield fluctuation-dissipation the- 
orems. The thermodynamic entropy production can also be stated in terms of the 
caliber functional, resulting in a simple proof of our microscopic form for the Clausius 
statement. We find that the maximum entropy route provides a clear derivation of 
the path free energy functional, path-integral, Langevin, Brownian, and Fokker-Planck 
statements of nonequilibrium processes. Physical considerations justify a fundamental 
definition of thermodynamic entropy increase as system information entropy plus heat 
exchange with an external thermostatic system. 
The Green-Kubo fluctuation theorems [1] relate equilibrium time-correlation functions 
with the time-response of system observables to an external driving force. They are 
important for their ability to calculate the transport coefficients appearing in Onsager's 
phenomenological relation [2], 

$$\dot a_i = \sum_j L_{ij} X_j, \tag{1}$$

which identified $X_j$ with the "entropic" driving forces, 
$\partial S/\partial a_j \approx \beta(t) - \beta_{eq}$.[3] 

These two relations form a rough draft for a first law of nonequilibrium thermodynamics. 
Combined with the usual second law prescription of increasing entropy, the above 
establishes the direction in which the system will relax toward equilibrium and an estimate 
of the entropy increase by this process. However, the linear transport equations have been 
derived by analogy with the equilibrium theory, and their interpretation must be made with 
respect to the entropy of a quasi-steady process. A more satisfactory development would 
therefore determine the range of conditions for which Eq. [1] holds, as well as provide a 
foundation for studying processes with arbitrary driving forces and defined without reference 
to any equilibrium state. If, in addition, the theory were able to resemble the well-known 
equilibrium statistical mechanics, it would offer a wealth of immediate insight into new 
applications for which now-standard nonequilibrium methods may prove cumbersome and error 
prone. Such a resemblance must be reached by defining path functionals analogous to the 
energy and entropy of equilibrium states - and would therefore constitute a true statistical 
mechanics for thermodynamics (as opposed to thermostatics). Our first question in this 
investigation will be how such a microscopic first law of thermodynamics can be formulated. 

Although much work has been devoted to the above problems, these questions have been 
addressed from a large number of different viewpoints in the last thirty years. Further, there 
appears to be little consensus on a unifying, general set of relations from which all 
nonequilibrium results may be derived.[4] Recent work has centered on a second question 
of deriving a microscopic second law of thermodynamics through proving the existence of 
fluctuation theorems [5] and exploring their consequences for nonequilibrium systems. 

A fundamental fluctuation theorem result is an expression for the "lost work" over and 
above the equilibrium free energy for stochastic processes that convert one thermostatic 
state to another [6] via mechanical driving. This lost work can be interpreted as an entropy 
increase. It is simple to show in the case of an isolated deterministic system[7] 



where the work, $W$, is the gain in system energy, $\Delta U = U_S(x_S) - U_0(x_0)$, and both 
distributions are required to be at the same temperature, $(k_B \beta_0)^{-1}$, to use the 
equilibrium relation $\beta_0 \Delta F = -\ln Z_S/Z_0$.[8] Further work has provided examples 
of many fluctuation theorems.[1, 9] Each can be used to define a measure of irreversibility, 
since for any two path probabilities, $A$ and $B$, 






$$\left\langle \ln \frac{\mathcal{P}_B}{\mathcal{P}_A} \right\rangle_B = D[B|A] \geq 0, \tag{2}$$
where D is the Kullback-Leibler divergence (and necessarily non-negative). Although special 
significance is often attributed to time reversal (defined by replacing the time evolution 
operator from $i$ to $i+1$ with a reversal of odd functions of time at step $i+1$, unaltered 
time evolution from $x^*_{i+1}$ to $x^*_i$, and another reversal of odd functions at $i$),[10] 
this operation is not possible if one-way steps are present, so that only one of 
$i \to i+1$ or $(i+1)^* \to i^*$ has probability zero. Furthermore, the extension of this 
formalism to systems lacking momentum, such as discrete processes described by a transition 
probability matrix, is also unclear (although suggestions involving a [possibly non-unique] 
stationary distribution have been offered[5]). Until these issues are resolved, there does 
not exist a definitive path functional able to answer the first question above. 
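
As a small numerical illustration of the inequality in Eq. [2], the following sketch enumerates every path of two discrete Markov processes and averages $\ln(\mathcal{P}_B/\mathcal{P}_A)$ over process B. The two-state transition matrices, the uniform initial distribution, and the helper names are hypothetical choices made only for this example; they do not come from the text.

```python
import itertools
import math


def path_probability(path, initial, transition):
    """Probability of a discrete path under a Markov chain."""
    prob = initial[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= transition[a][b]
    return prob


def kl_divergence(init_b, trans_b, init_a, trans_a, n_steps):
    """D[B|A] = <ln(P_B / P_A)>_B, summed over all paths with n_steps transitions."""
    n_states = len(init_b)
    total = 0.0
    for path in itertools.product(range(n_states), repeat=n_steps + 1):
        p_b = path_probability(path, init_b, trans_b)
        p_a = path_probability(path, init_a, trans_a)
        if p_b > 0.0:
            # If p_a were zero here (a one-way step in A), the divergence
            # would be infinite; the hypothetical matrices below avoid that.
            total += p_b * math.log(p_b / p_a)
    return total


# Hypothetical two-state processes (illustrative numbers only).
INIT = [0.5, 0.5]
TRANS_A = [[0.9, 0.1], [0.2, 0.8]]
TRANS_B = [[0.6, 0.4], [0.5, 0.5]]

d_ba = kl_divergence(INIT, TRANS_B, INIT, TRANS_A, n_steps=3)
d_bb = kl_divergence(INIT, TRANS_B, INIT, TRANS_B, n_steps=3)
print(d_ba, d_bb)
```

The divergence vanishes when the two processes coincide and is positive otherwise. A transition with zero probability in A but not in B would make the logarithm diverge, which is the one-way-step difficulty mentioned above.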

Early works were principally focused on the first question. After a detailed picture of 
how time-correlations control the rate of relaxation to equilibrium had emerged,[11, 12] a 
general set of relations (projector-operator theory [13, 14, 15, 16]) was described from which 
all such macroscopic relaxations may be derived. The principal content of the theory was to 
define a coarsening, projection, operator which removes information about the unmodeled 
degrees of freedom. The exact, non-Markovian kinetic equation for the probability distribution 
in the space of remaining variables then shows the relaxation process in the form of a time 
convolution of the time-correlation functions and the thermodynamic forces driving them to 
equilibrium.[13, 14] That is, Eq. [1] should be replaced with 



where $L_{ij}(s) = \langle a_i(t) a_j(s) \rangle$. Despite conjectured relationships to a 
nonequilibrium entropy function,[17, 18, 16] its relation to the thermostatic entropy has not 
yet been fully justified in terms of the maximum entropy formalism or clarified to the point 
where it is possible to derive the above from a second derivative [19] - analogous to the 
Gibbs relations for equilibrium which give rise to obviously related quantities such as heat 
capacity, compressibility and coefficient of thermal expansion. 

At this point, the situation shared a peculiar similarity to the circumstances surrounding 
Gibbs' classical text introducing the principle of maximum entropy [20]. The method for 
deriving fluctuation dissipation theorems (analogous to the first law) was to define a system 
evolving according to an exact Lagrangian, make a random phase approximation to yield 
an ensemble of exactly evolving trajectories, and then derive a corresponding "physical" 
distribution on trajectory space (analogous to phase-space, Tbl. 1). Because of the 
prevailing attitude regarding mechanics as the only possible method for solving such problems 
at the time, Gibbs' use of maximum entropy methods was seen as a non-physical trick[21] to 
derive properties of molecular equilibrium. Again, introducing maximum entropy led to an 
expansive generalization of the fluctuation dissipation theorem by Jaynes[19, 22] - who, it 
should be noted, has recognized and written about the above conflict of Gibbs [25]. 

Jaynes introduced maximum entropy following the program of Gibbs and using the subjective 
interpretation of probability used by Laplace and Jeffreys.[23] Making the substitution 
from a state (containing a variable at a single time) to an entire trajectory immediately 
identifies a non-equilibrium analog of the Gibbs ensemble, complete with entropy and free 
energy functionals on trajectory space. Pursuing the analogy further, Jaynes showed that 
the first derivatives of the path free energy yielded averages of path functionals, and the 
second derivatives (minus the space and time-correlation functions) give their "first-order" 
perturbation with respect to changing the thermodynamic forces, $\lambda$.[23, 19, 25] 
Therefore we can shorten Eq. [3] to 

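$$\langle a \rangle \approx \langle a \rangle_0 - \sum_k \left\langle (a - \langle a \rangle_0)(b_k - \langle b_k \rangle_0) \right\rangle_0 (\lambda_k - \lambda_k^0), \tag{4}$$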

for any path functional, $a$, we wish to average and any set of "controlled" quantities, 
$\{b_k\}$, which define our trajectory ensemble. If, for example, $b_k$ included time and 
space-dependent particle fluxes (i.e. $k$ indexes both time and space so that 
$\sum_k \to \iint dt\,dx$), then Eq. [4] easily generalizes Eq. [1] to fluctuation relations 
defined directly from the set of constrained path functionals, $\langle b_k \rangle$. Notice 
that the perturbation expansion above is not defined by reference to an equilibrium state, 
but instead with respect to a reference probability distribution on path space. Examples of 
such elegant derivations of fluctuation theorems have been given many times in works 
employing path integral methods [26] such as those related to the Onsager-Machlup action 
[27]. Further work[10, 28, 29] has also shown how Jaynes' path entropy 
may be connected to fluctuation and entropy production theorems. 
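
The first-order expansion in Eq. [4] can be checked numerically on a toy ensemble. The sketch below is only an illustration: the four "paths", the values of the functionals $a$ and $b$, the uniform prior, and the function names are all hypothetical. It compares the exact average under a weight proportional to $e^{-\lambda b}$ with the covariance-based estimate about a reference value $\lambda^0$.

```python
import math


def reference_weights(prior, values_b, lam):
    """Normalized weights p proportional to prior * exp(-lam * b)."""
    w = [p * math.exp(-lam * b) for p, b in zip(prior, values_b)]
    z = sum(w)
    return [wi / z for wi in w]


def exact_average(values_a, values_b, prior, lam):
    """Exact average of a in the ensemble with multiplier lam."""
    p = reference_weights(prior, values_b, lam)
    return sum(pi * a for pi, a in zip(p, values_a))


def linear_response_average(values_a, values_b, prior, lam, lam0):
    """Right-hand side of Eq. [4] for a single controlled quantity b."""
    p_ref = reference_weights(prior, values_b, lam0)
    mean_a = sum(p * a for p, a in zip(p_ref, values_a))
    mean_b = sum(p * b for p, b in zip(p_ref, values_b))
    cov = sum(p * (a - mean_a) * (b - mean_b)
              for p, a, b in zip(p_ref, values_a, values_b))
    return mean_a - cov * (lam - lam0)


# Hypothetical toy ensemble of four paths (illustrative numbers only).
A_VALUES = [0.0, 1.0, 2.0, 3.0]
B_VALUES = [1.0, -0.5, 0.3, 2.0]
PRIOR = [0.25, 0.25, 0.25, 0.25]
LAM0, LAM = 0.2, 0.201

exact = exact_average(A_VALUES, B_VALUES, PRIOR, LAM)
approx = linear_response_average(A_VALUES, B_VALUES, PRIOR, LAM, LAM0)
print(exact, approx)
```

Note that the reference ensemble here is not an equilibrium state, only a reference distribution at $\lambda^0$, in line with the remark above.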

However, Eq. [4] contains a fatal flaw. To see this, first note that the average of a quantity 
at time $t$ is dependent on the "control" parameters throughout the whole trajectory. This is 
because logical inference does not contain a preferential time direction, and knowledge that 
the system has a given property at time $t + \tau$ constrains the state at time $t$. Although 
this could be alleviated by requiring only causal information to enter into the determination 
of the state at time $t$, this approach leads to a probability distribution valid only for 
$x_t$ and not any previous times. A better approach is to maximize the entropy of the 
transition probability distribution. 

Limiting the scope of the maximum trajectory entropy procedure in this way automatically 
corrects a related re-normalization problem. Suppose phase space were to branch at 
a future time $t + \tau$. In this case, a uniform measure on path space would assign points at 
time $t$ a different weight depending on future events. A simple example is the Monty Hall 
problem with a prize assigned to the first door without loss of generality.[30] The 
contestant's choice of one of three doors plus Monty opening another (not concealing a prize) 
constitutes four possible paths, and maximum path entropy would give each path an equal 
weight - an intuitive, but incorrect, solution. The correct probability assignment is a 
uniform distribution for each transition, leading to ending weights of 1/3 x 1/2 = 1/6 for 
the two paths following the selection of the first door.^1 In more abstract terms, the 
marginal probability at each time should not depend on the future - a concept expressed 
mathematically by defining a progressively measurable function with respect to the natural 
filtration of stochastic processes [31]. 

^1 We leave updating the player's state of knowledge about the prize out of the discussion, 
and reiterate that the full path probabilities are 1/6, 1/6 (contestant's correct choice plus 
Monty choosing randomly from the remaining doors), and 1/3, 1/3 (contestant's two incorrect 
choices plus Monty's forced choice of an incorrect door). 
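
The Monty Hall counting argument can be made concrete with a short enumeration. This is a sketch under the same setup as in the text (prize behind the first door); the function and variable names are hypothetical. It contrasts equal weights per path with equal weights per transition.

```python
from fractions import Fraction

PRIZE_DOOR = 1
DOORS = (1, 2, 3)


def monty_paths():
    """All (contestant choice, door Monty opens) paths with the prize behind door 1."""
    paths = []
    for choice in DOORS:
        # Monty may open any door that is neither the contestant's nor the prize door.
        for opened in DOORS:
            if opened != choice and opened != PRIZE_DOOR:
                paths.append((choice, opened))
    return paths


def uniform_path_weights(paths):
    """Maximum path entropy: every complete path gets the same weight."""
    return {path: Fraction(1, len(paths)) for path in paths}


def uniform_transition_weights(paths):
    """Maximum transition entropy: each step is uniform over its allowed outcomes."""
    weights = {}
    for choice, opened in paths:
        n_options = sum(1 for c, _ in paths if c == choice)
        weights[(choice, opened)] = Fraction(1, len(DOORS)) * Fraction(1, n_options)
    return weights


def marginal_first_choice(weights, choice):
    """Probability that the contestant initially picks the given door."""
    return sum(w for (c, _), w in weights.items() if c == choice)


PATHS = monty_paths()
by_path = uniform_path_weights(PATHS)
by_transition = uniform_transition_weights(PATHS)
print(marginal_first_choice(by_path, 1), marginal_first_choice(by_transition, 1))
```

Under equal path weights the marginal probability of initially choosing the first door depends on how many continuations that choice has, which is the dependence on future events that the transition-based assignment removes.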

In practice, this flaw can be avoided by considering only processes where assigning equal 
a priori weight to all paths is equivalent to assigning equal weight to all transitions. By 
Liouville's theorem, this is obviously true for deterministic processes. More generally, the 
approaches are equivalent when the number of possible transitions does not depend on the 
starting point. The path entropy approach of Jaynes is therefore valid in the absence of 
factors re-normalizing for starting-point dependent differences in the number of possible 
paths ($Z[\lambda_i, x_i] = Z[\lambda_i]$ in Eq. [8]). 

A further question remains on the application of fluctuation-dissipation theorems to both 
Langevin and Brownian (overdamped Langevin) processes. What is the appropriate order 
for coarse-grained equations of motion? Writing down a Langevin equation assumes Newtonian, 
second-order dynamics, whereas the first-order Brownian motion can also be derived 
via the same approach. Although the Green-Kubo relations were supposed to have solved 
these problems, ambiguity remains from this approach at a fundamental level because a 
strict derivation using the method of mechanics does not offer direct insight into the choice 
of macroscopic variables used for systematic coarse-graining. In fluid mechanics, either can 
be applied, and the choice between full Newtonian motion models or simplified 
advection-diffusion models is predictably based on the scales of length, relaxation time, and 
applied force involved. Several authors have made substantial contributions [32, 33, 34, 35] 
in connecting fluid mechanics and non-equilibrium dynamics. 

Although seemingly incompatible, there are advantages to both the mechanics-based and 
the statistical derivations. For example, considering the choice of equations of motion from 
the statistical perspective of Eq. [4] at once allows us to see the consequences of each 
choice. If we choose a first-order equation of motion, then we substitute the velocity at a 
point for $a$ and expand it about a streaming velocity, $\langle a \rangle_0$, in the particle 
fluxes, $b_j(x,t)$. If second order, then $a$ becomes a change in momentum at a point and we 
expand about the average force, $\langle a \rangle_0$, in the stress tensor of the surrounding 
fluid, $b(x,t)$. We believe that further corrections sometimes employed in fluid mechanics 
may also be derived via extending the process that led to Eq. [4]. If both the nonequilibrium 
first and second laws could be addressed from the same perspective, it seems that an 
expansive generalization of non-equilibrium statistical mechanics could be achieved. 

In order to combine these two viewpoints, we introduce the following maximum entropy 
argument. Suppose the information about the time change of a set of dynamic variables, $x$, 
consists in a set of prescribed averages, 
$\{\langle f_k(x_{i+1}, x_i) | x_i \rangle\}_{k=1}^m$. Using only this information, 
we are to construct a probability distribution for the dynamic variables at point $i + 1$, 
given a known $x$ at time $i$. According to the standard maximum information entropy 
($\mathcal{H}$) machinery, the answer is 

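$$-\frac{\delta \mathcal{H}_{i+1|i}}{\delta p(x_{i+1}|x_i)} = \ln \frac{p(x_{i+1}|x_i)}{p^0(x_{i+1}|x_i)} + 1 = -(\ln Z[\lambda_i, x_i] - 1) - \sum_{k=1}^m \lambda_{k,i} f_k(x_{i+1}, x_i) \tag{5}$$

$$p(x_{i+1}|x_i) = p^0(x_{i+1}|x_i)\, e^{-\eta_i[x]} / Z[\lambda_i, x_i] \tag{6}$$

$$\eta_i[x] \equiv \sum_{k=1}^m \lambda_{k,i} f_k[x_{i+1}, x_i] \tag{7}$$

$$Z[\lambda_i, x_i] = \sum_{x_{i+1}} p^0(x_{i+1}|x_i)\, e^{-\eta_i[x]} \tag{8}$$

$$\mathcal{H}_{i+1|i} = \ln Z[\lambda_i, x_i] + \langle \eta_i | x_i \rangle \tag{9}$$

$$d\mathcal{H}_{i+1|i} = \sum_{k=1}^m \lambda_k\, d\langle f_k | x_i \rangle \tag{10}$$

$$\mathcal{F}_{i+1|i} \equiv \sum_{k=1}^{m'} \lambda_k \langle f_k | x_i \rangle - \mathcal{H}_{i+1|i} \tag{11}$$

$$d\mathcal{F}_{i+1|i} = \sum_{k=1}^{m'} \langle f_k | x_i \rangle\, d\lambda_k - \sum_{k=m'+1}^{m} \lambda_k\, d\langle f_k | x_i \rangle. \tag{12}$$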

These are the expressions relating to the statistical state at time $i + 1$ given information 
on the transition probability distribution. The functional notation for quantities such as 
$\eta_i[x]$ has been used to indicate that in general, these may be considered as functionals 
depending on the trajectory over all times before $i + 1$. In the continuous limit, the above 
quantities exist between times $i$ and $i + 1$, and should be viewed in the Stratonovich 
definition. The appendix shows example calculations of the partition function for the Wiener 
process. 
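
For a finite state space, the maximum transition entropy construction can be evaluated directly. The sketch below is a minimal illustration, assuming a three-state target space, a hypothetical reference transition row, and two hypothetical constraint functions; it builds the tilted transition probability and partition function for one starting state and evaluates the conditional entropy relative to the reference distribution.

```python
import math


def max_entropy_transition(p0_row, f_rows, lambdas):
    """Tilted transition probabilities and partition function for one starting state.

    p0_row[j]    : reference probability of moving to state j
    f_rows[k][j] : value of constraint function f_k for a move to state j
    lambdas[k]   : Lagrange multiplier for constraint k
    """
    eta = [sum(lam * f[j] for lam, f in zip(lambdas, f_rows))
           for j in range(len(p0_row))]
    weights = [p0 * math.exp(-e) for p0, e in zip(p0_row, eta)]
    z = sum(weights)
    return [w / z for w in weights], z


def conditional_entropy(p_row, p0_row):
    """Entropy relative to the reference row, -sum p ln(p / p0)."""
    return -sum(p * math.log(p / p0) for p, p0 in zip(p_row, p0_row) if p > 0.0)


def averages(p_row, f_rows):
    """Conditional averages <f_k | x_i> under the transition probability."""
    return [sum(p * fj for p, fj in zip(p_row, f)) for f in f_rows]


# Hypothetical example: displacement and squared displacement as constraints.
P0_ROW = [0.2, 0.5, 0.3]
F_ROWS = [[-1.0, 0.0, 1.0], [1.0, 0.0, 1.0]]
LAMBDAS = [0.4, 0.7]

p_row, z = max_entropy_transition(P0_ROW, F_ROWS, LAMBDAS)
entropy = conditional_entropy(p_row, P0_ROW)
mean_f = averages(p_row, F_ROWS)
mean_eta = sum(lam * f for lam, f in zip(LAMBDAS, mean_f))
print(entropy, math.log(z) + mean_eta)
```

The two printed numbers agree, which is the discrete form of the relation between the entropy, the log partition function, and the average of $\eta_i$. Perturbing the multipliers slightly and differencing the entropy gives a similar check of the differential relation for $d\mathcal{H}_{i+1|i}$.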

The essential difference between Eq. [6] and the maximum path entropy prescription is 
that in the latter, the path probability is 

$$\mathcal{P}(\{x_i\}_0^N) = \frac{e^{-A[\{x_i\}_1^N] - \beta U(x_0)}}{Z}\, \mathcal{P}^0(\{x_i\}_0^N),$$

for some path functional, $A$, and constant, $Z$, while this is replaced by 

$$\mathcal{P}(\{x_i\}_0^N) = \mathcal{P}(x_0) \prod_{i=0}^{N-1} \frac{p^0(x_{i+1}|x_i)\, e^{-\eta_i[x]}}{Z[\lambda_i, x_i]}$$

in the former. The present conditional expansion implies a dual characterization of a 
stochastic process as a single ensemble of paths and a set of telescoping ensembles, each with 
well-defined, non-anticipating energy changes through 
$\Delta F_k = \sum_{i=0}^{N-1} \langle f_k[x_{i+1}, x_i] \rangle$. 
Equilibrium                        Non-Equilibrium
---------------------------------  --------------------------------------
Phase Space                        Trajectory Space
Free Energy                        Path Free Energy
Entropy                            Caliber
Average Value Constraint           Average Flux Constraint
Equilibrium Average                Path Average
Conditional Free Energy (PMF)      Conditional Path log-Probability
Thermodynamic forces (δPMF)        Changes in Path Flux (work)
Heat Capacity                      Thermal Conductivity
(none)                             Irreversibility and Entropy Production

Table 1: Correspondence between single-time and time-dependent path maximum entropy 
formulations of statistical mechanics. 

Using the above maximum transition entropy form has several distinct advantages for 
the derivation of non-equilibrium relations. Not least is the correspondence to the canonical, 
maximum entropy form of equilibrium thermostatics pointed out in Table 1. We begin by 
deriving the generalized Langevin and Brownian dynamics from a consideration of the action 
deviation as a constrained quantity. In Sec. [2], we connect the Langevin equations derived 
from our stochastic action deviation principle to energy exchange and thermodynamic 
entropy production via interaction with external reservoirs.[36] Following Jaynes' information 
theoretic derivation of statistical mechanics, we identify quantities analogous to free 
energy and entropy functionals for maximum entropy transition probability processes in 
Sec. [3]. The coefficient of thermal expansion, isothermal compressibility, and heat capacity 
can be derived using second derivatives of the equilibrium free energy. In an analogous way, 
we show how Green-Kubo transport theory can be derived from second derivatives of Legendre 
transforms of the maximum transition entropy functional. These relations can describe the 
response of both steady and non-steady states arbitrarily far from equilibrium. Applications 
of this result enable calculation of derivatives of the current-voltage curve at constant 
current or constant voltage and give general conditions under which Onsager reciprocity will 
hold. We then connect these functionals to the thermodynamic entropy production and 
Crooks-Jarzynski fluctuation relations. 
1 Dynamic Constraints 



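Starting from a mechanics problem specified by a Lagrangian, $L(x, \dot x)$, the usual 
mechanical prescription is to require stationary action, 

$$A = \int_0^\tau L(x, \dot x, t)\, dt$$

$$\delta A = \int_0^\tau \left( \frac{\partial L}{\partial x} \delta x + \frac{\partial L}{\partial \dot x} \delta \dot x \right) dt + L \Big|_0^\tau
 = \int_0^\tau \left( \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot x} \right) \delta x(t)\, dt + \frac{\partial L}{\partial \dot x} \delta x \Big|_0^\tau + L \Big|_0^\tau$$

$$\frac{\delta A}{\delta x(t)} = \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot x} \equiv F - \dot p. \tag{13}$$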

In this report, we will freely substitute force, $F = \partial L/\partial x$, and momentum, 
$p = \partial L/\partial \dot x$. The requirement for stationary action then reads 
$\frac{\delta A}{\delta x(t)} = 0 \Rightarrow F = \dot p$. 

When $L = \dot x^T M \dot x/2 - U(x)$, the above procedure directly gives Newtonian mechanics 
and has the advantage of being generally valid under coordinate transformations, $y = y(x)$, 
$x = x(y)$. However, Eq. [13] gives second-order equations of motion, requiring 
$\dot x = dx(t)/dt$ by definition, and complicating discussions of numerical integration. A 
first-order form can be derived from an alternate Lagrangian, 

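$$L(q, \dot q, v) = \left(\dot q - \tfrac{1}{2} v\right)^T M v - U(q), \tag{14}$$

by treating $q(t)$ and $v(t)$ as separate dynamic variables. Using $x(t) = \{q(t), v(t)\}$ in 
Eq. [13], we find 

$$\frac{\delta A}{\delta q} = F(q) - M \dot v \tag{15}$$

$$\frac{\delta A}{\delta v} = M(\dot q - v). \tag{16}$$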

Setting these two equal to zero gives a result entirely equivalent to Newtonian mechanics, 
but in which we may consider a Verlet-type integration process of updating v with fixed q, 
and then updating q with v fixed. It has been found that Eq. [13] forms a solid basis for 
forming generalizations of physical laws. 
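
The Verlet-type update described above can be sketched for a one-dimensional harmonic oscillator. This is only an illustration: the harmonic force, the unit mass and spring constant, and the time step are hypothetical choices, and the scheme shown (update $v$ at fixed $q$, then $q$ at fixed $v$) is one simple realization of the alternating update, not necessarily the integrator intended by the authors.

```python
def force(q, spring=1.0):
    """Force F(q) = -dU/dq for a hypothetical harmonic potential U = spring * q^2 / 2."""
    return -spring * q


def step(q, v, dt, mass=1.0):
    """One alternating update of the first-order equations of motion.

    First update v with q held fixed (M dv/dt = F(q)),
    then update q with the new v held fixed (dq/dt = v).
    """
    v_new = v + dt * force(q) / mass
    q_new = q + dt * v_new
    return q_new, v_new


def energy(q, v, mass=1.0, spring=1.0):
    """Total energy U(q) + v M v / 2 of the oscillator."""
    return 0.5 * mass * v * v + 0.5 * spring * q * q


# Integrate for 10000 steps and track the largest relative energy deviation.
q, v, dt = 1.0, 0.0, 0.01
e0 = energy(q, v)
max_deviation = 0.0
for _ in range(10000):
    q, v = step(q, v, dt)
    max_deviation = max(max_deviation, abs(energy(q, v) - e0) / e0)
print(max_deviation)
```

With this ordering the energy error stays bounded over long runs rather than drifting, which is the usual motivation for splitting the update into a velocity step and a position step.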

A stochastic generalization may be to permit small deviations by constraining 
$f = \frac{\delta A}{\delta x(t)} \frac{\delta A}{\delta x(t - \Delta t)}^T \approx 0$. The 
resulting matrices of constraint values (Lagrange multipliers 
$G_{t,\Delta t} \in [0, \infty)$) can restrict deviations in the action in a history-dependent 
and non-local way. Each deviation in the action can be thought of as arising from elastic 
collisions with un-modeled molecules from the surrounding 'bath' environment or as an unknown 
Lagrangian applied to the system between times $t - \epsilon$ and $t$ [37] (with $\epsilon$ a 
small time increment). Without any other constraints, this squared-deviation constraint 
implies interaction with a completely chaotic (infinite temperature) bath and does not 
conserve energy, momentum, etc. except in the deterministic limit ($G \to \infty$). 

When two systems are coupled, the combined system should obey a set of conservation 
laws. As formalized by Noether's theorem,[38] such conservation laws can be derived for a 
single system directly from the action formulation by considering continuous transformations 
of the trajectory, $x(t) \to x(t) + q(t, a)$, in a region around $a = 0$, where $q(t, 0) = 0$. 
Because the action is stationary with respect to small perturbations in $x(t)$, there exists 
a vanishing quantity, 



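$$\frac{dI(t)}{dt} \equiv -\frac{dA}{da}\bigg|_{a=0} = -\frac{\partial q(t,a)}{\partial a}^T\bigg|_{a=0} \frac{\delta A}{\delta x(t)} \tag{17}$$

(with $T$ reserved for denoting transposition) and it is possible to define an invariant 
(using $y \equiv \partial q(t,a)/\partial a \,|_{a=0}$), 

$$I(t) - I(-\infty) = \int_{-\infty}^{t} -y^T(t') \frac{\delta A}{\delta x(t')}\, dt'.$$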



Feynman showed that if the action functional is invariant to this transformation 
($A[x(t)] = A[x(t) + q(t, a)]$) then the corresponding invariant is a conventional conserved 
quantity - e.g. $x(t) \to x(t) + a$ ($y = 1$, the ones-vector) generates the momentum, 
$x(t) \to x(t + a)$ ($y = \dot x$) generates the energy, etc. 

It is instructive to consider a case where the action is not invariant to the transformation. 
For example, the form of the action is changed using the substitution for momentum in the 
simple harmonic oscillator ($L = m\dot x^2/2 - kx^2/2 \to m\dot x^2/2 - k(x + a)^2/2$). 
However, if we assume the existence of a generalized momentum, $p$, that is nonetheless an 
invariant of some 'complete' system, we can define (using Eq. [17]) 

$$\frac{dI}{dt} = \dot p - F = m\ddot x + kx$$

as a momentum exchange. If the net momentum of the observed system changes by more 
than its internal force, it implies that an external system has lost exactly this amount of 
momentum. In the presence of noise, this quantity may not be conserved ($dp$ is stochastic) 
- so that one effect of the noise is to re-distribute $p$ over the system and the bath. 

In general, we can define a set of constraints on $\{\langle dI_j(t)/dt \rangle\}_{j=0}^m$ 
using an $n \times (m + 1)$ matrix, $Y$ (with $n$ the dimension of $x$), whose columns 
correspond to the constraint directions, $\partial q_j(t,a)/\partial a \,|_{a=0}$, and whose 
leftmost column is reserved for the energy, $Y_{\cdot,0} = \dot x$. In this case, the vector 
of exchanges for a given trajectory over a given time interval, $\epsilon$, is 

$$dI(t) \equiv -Y(t)^T \frac{\delta A}{\delta x(t)} \epsilon. \tag{18}$$

In this equation, the presence of $\epsilon$ is used to implicitly denote the Stratonovich 
integral (see Appendix). 
These considerations have shown a simple method for including the influence of an external 
system on the dynamics. If we assume the existence of some "total" invariant between 
the system and the bath, then it makes sense to enforce a stochastic constraint on the 
average change, $\langle dI(t) \rangle$. We should note our fixed sign convention, where $I$ 
is always taken to be a quantity belonging to the central system under consideration. Before 
moving on to discuss the obvious connection of these changes to the thermodynamic work, we 
shall first consider the significance of the Lagrange multipliers in this formalism. 

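Collecting the constraints on the action deviations and on $dI$, and carrying out the 
maximum entropy procedure specified in Eq. [5] for determining $x(t)$ at each time, $t$, given 
a history $x(t')$ for $t' < t - \epsilon$, $\epsilon \to 0^+$, 

$$P(x(t) \,|\, x(t')) = \exp\left\{ -\epsilon \left[ \frac{\delta A}{\delta x(t)}^T G_0 \frac{\delta A}{\delta x(t)} + \frac{\delta A}{\delta x(t)}^T \left( \int^{t-\epsilon} G_{t-\tau} \frac{\delta A}{\delta x(\tau)}\, d\tau - Y(t)\beta/2 \right) \right] \right\} \times Z[\beta, x(t')]^{-1}$$

$$\propto \exp\left\{ -\frac{1}{2} \left( \frac{\delta A}{\delta x(t)} \epsilon - \mu(t) \right)^T (C\epsilon)^{-1} \left( \frac{\delta A}{\delta x(t)} \epsilon - \mu(t) \right) \right\}$$

$$C \equiv (2G_0)^{-1}$$

$$\mu(t) \equiv C \left( Y(t)\beta\epsilon/2 - \epsilon \int^{t-\epsilon} G_{t-\tau} \frac{\delta A}{\delta x(\tau)}\, d\tau \right). \tag{19}$$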

In this equation, $Y\beta$ is a vector with the dimension of the system coordinates, since 
$\beta = [\beta_0, \ldots, \beta_m]^T$. The action deviation, 
$\frac{\delta A}{\delta x(t)}\epsilon$, then follows a Normal distribution with mean $\mu(t)$ 
and single-time variance/covariance matrix $C\epsilon$. The history integral in the above 
equation could alternatively have been written in terms of a time-dependent covariance 
function. In this work, however, we will not be concerned with the calculation of 
history-dependent partition functionals for which this transformation becomes useful. 

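Enforcing a constraint for the Hamiltonian energy change, 
$\langle dH \rangle = -\left\langle \dot x^T \frac{\delta A}{\delta x(t)} \right\rangle \epsilon$, 
using the Lagrange multiplier $\beta_0/2$ (along with a possibly empty set of additional 
constraints in the form of Eq. [18]), leads directly to a Generalized Langevin equation 

$$\dot p\, \epsilon = F(x(t))\, \epsilon - C \left( \dot x \beta_0 \epsilon/2 + Y\beta\epsilon/2 + \epsilon \int^{t-\epsilon} G_{t-\tau} \left( \dot p(\tau) - F(\tau) \right) d\tau \right) + C^{1/2} dW(t). \tag{20}$$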

Here the centered (Stratonovich) Wiener process increment has been substituted for the 
standard normal random variate, $z(t)$, at time $t$ using $dW(t) = \epsilon^{1/2} z(t)$. It is 
well-known from the Fokker-Planck equation [39] 

$$\frac{\partial \rho(x,p)}{\partial t} = -(M^{-1} p)^T \nabla_x \rho - \nabla_p^T \left[ F\rho - C \dot x \beta_0 \rho/2 - C \nabla_p \rho/2 \right]$$
that the solution to this equation (in the memory-free case, but see also Ref. [10]) is the 
canonical distribution with temperature $\beta_0^{-1} = k_B T$ (where $k_B$ is the Boltzmann 
constant). This solution is independent of $G$, suggesting a natural parametrization for the 
Langevin equation is in terms of the temperature and $C^{1/2}$, related to the thermal 
conductivity or rate of temperature equilibration (see Eq. [21]). Generally, if an invariant, 
$I$, can be expressed as a function of $x, p$, then a result similar to the above should hold 
for other common equilibrium thermodynamic ensembles as well, such as the $N, P, T$ ensemble 
where $\langle dV \rangle$ constitutes an additional dynamic variable and constraint. This 
establishes the physical interpretation of the Lagrange multipliers as the thermostatic 
variables of the bath that dictate the eventual equilibrium of the system. Note that 
increasing $\beta$ tends to decrease $\langle dI \rangle$, for example increasing pressure 
will drive the volume downward. 
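
The statement that the canonical distribution is stationary for any noise strength can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions: a harmonic force $F = -kx$, unit mass and spring constant, $\beta_0 = 1$, a memory-free Fokker-Planck operator with only the energy constraint, and central finite differences; all names and numerical values are hypothetical.

```python
import math

MASS, SPRING, BETA0 = 1.0, 1.0, 1.0


def canonical_density(x, p, beta):
    """Unnormalized exp(-beta * H) with H = p^2 / (2 m) + k x^2 / 2."""
    return math.exp(-beta * (p * p / (2.0 * MASS) + 0.5 * SPRING * x * x))


def fokker_planck_rhs(rho, x, p, c, h=1e-3):
    """Memory-free Fokker-Planck right-hand side in one dimension,

        -(p/m) d_x rho - d_p [ F rho - C (p/m) beta0 rho / 2 - C d_p rho / 2 ],

    evaluated at (x, p) with central finite differences of step h.
    """
    def d_x(f, xx, pp):
        return (f(xx + h, pp) - f(xx - h, pp)) / (2.0 * h)

    def d_p(f, xx, pp):
        return (f(xx, pp + h) - f(xx, pp - h)) / (2.0 * h)

    def flux(xx, pp):
        drift = -SPRING * xx - c * (pp / MASS) * BETA0 / 2.0
        return drift * rho(xx, pp) - 0.5 * c * d_p(rho, xx, pp)

    return -(p / MASS) * d_x(rho, x, p) - d_p(flux, x, p)


def bath_temperature_density(x, p):
    """Canonical density at the bath temperature, beta = BETA0."""
    return canonical_density(x, p, BETA0)


def colder_density(x, p):
    """Canonical density at a different temperature, beta = 2 * BETA0."""
    return canonical_density(x, p, 2.0 * BETA0)


residuals = [fokker_planck_rhs(bath_temperature_density, 0.3, 0.4, c)
             for c in (0.5, 1.0, 4.0)]
mismatch = fokker_planck_rhs(colder_density, 0.3, 0.4, 1.0)
print(residuals, mismatch)
```

The residuals vanish (to discretization error) for every value of $C$ when the density is at the bath temperature, while a density at a different temperature gives a non-zero time derivative, so the damping and noise terms drive it toward $\beta_0$.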

The present work shares some conceptual similarity to the second entropy of Attard ^21 
115] . In this report, however, we have been able to derive our results in a mathematically 
rigorous way directly from two extremum principles, a maximum entropy expression for the 
transition probability (Eq. [6]), and constraints derived from an action functional (Eq. [13]). 
This allows for trivial generalizations to systems coupled with arbitrary reservoirs. In addi- 
tion, there is a clear physical motivation for the transition entropy and the full nonequilibrium 
entropy production which allows us to find the work done on the system by each constraint 
- as will be shown in the next section. 

The above set of equations is also sufficient for defining nonequilibrium analogues of 
intensive thermodynamic variables such as the temperature. This can be done by adding a 
hypothetical constraint, $\langle dI_j \rangle$, defined for some set of atoms or region of 
space in the system. Analogous to the operation of a thermometer (zero energy exchange on 
imposing stochastic and damping terms) to define temperature, we then require that no work is 
done on average, $\langle dI_0 \rangle = 0$. Integrating using the Stratonovich rules 
developed in the appendix, we find 



The result is the intuitive kinetic temperature, and is especially simple if we choose 
$C = 2M\gamma/\beta_0$ as is common for the Langevin equation. In that case, the ensemble 
average kinetic energy at each instant determines the temperature. For Boltzmann-distributed 
$\dot x$, regardless of the choice of $C$, the average heat flow is zero when $\beta_0$ 
determines the temperature. If different types of particles can be coupled to separate 
thermostats, such as in plasmas, then it becomes physically meaningful to speak of separate 
ionic and electronic temperatures. 
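
As a sketch of how a kinetic temperature can arise from the zero-work condition, the steps below assume, for illustration only, the memory-free form of Eq. [20] with the energy constraint alone and the standard midpoint (Stratonovich) average of the noise term. The symbol $n$ for the number of degrees of freedom is introduced here and is not taken from the text; the general result with memory and additional constraints may look different.

```latex
% Assumed memory-free Langevin increment, energy constraint only:
%   \dot p \epsilon - F \epsilon = -C \dot x \beta_0 \epsilon / 2 + C^{1/2} \circ dW
\begin{align*}
\langle dI_0 \rangle
  &= \left\langle \dot x^T (\dot p - F)\, \epsilon \right\rangle
   = -\frac{\beta_0 \epsilon}{2} \left\langle \dot x^T C \dot x \right\rangle
     + \left\langle \dot x^T C^{1/2} \circ dW \right\rangle \\
\left\langle \dot x^T C^{1/2} \circ dW \right\rangle
  &= \frac{\epsilon}{2}\, \mathrm{Tr}\!\left( M^{-1} C \right)
  \qquad \text{(midpoint rule with } \dot x = M^{-1} p \text{)} \\
\langle dI_0 \rangle = 0
  \;&\Rightarrow\;
  \beta_0 = \frac{\mathrm{Tr}\!\left( M^{-1} C \right)}{\left\langle \dot x^T C \dot x \right\rangle}
  \;\xrightarrow{\; C \,\propto\, M \;}\;
  \beta_0 = \frac{n}{\left\langle \dot x^T M \dot x \right\rangle}
\end{align*}
```

When $C$ is proportional to $M$ the noise strength cancels, leaving twice the average kinetic energy per degree of freedom as $\beta_0^{-1}$.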

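To end this section, we show that it is possible to derive a Brownian limit from our action 
functional approach using the alternative Lagrangian, Eqs. [14]-[16]. First, it can be shown 
that changes in the Hamiltonian are recovered by applying Eq. [18], 

$$\frac{dH}{dt} = -\dot q^T \frac{\delta A}{\delta q(t)} - \dot v^T \frac{\delta A}{\delta v(t)}
 = \dot q^T \left( \frac{\partial U}{\partial q} + M \dot v \right) - \dot v^T M (\dot q - v)
 = \frac{d}{dt} \left[ U(q) + v^T M v/2 \right]. \tag{21}$$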
Next, applying Eq. [19] and assuming $\frac{\delta A}{\delta q(t)} = 0$ exactly, we find a 
combined equation 

$$M \dot v\, \epsilon = F(q(t))\, \epsilon$$

$$\dot q\, \epsilon = v \epsilon + C_v \left( -\epsilon \int^{t-\epsilon} G_{v,t-\tau} \left( \dot q(\tau) - v(\tau) \right) d\tau \right) + C_v^{1/2} dW(t) \tag{22}$$

The second equation has the form of a Generalized Brownian motion equation, but includes 
terms related to the process $v(t)$. In particular, if the process $v(t)$ becomes unknown, 
then the best guess form $M \dot v = F(x)$, and a streaming velocity $v = v_0$, generate an 
appealing equation for Brownian motion. More rigorously, if the process $v(t)$ is assumed to 
be unknown, updates to $q$ should be made based on a stochastic realization of $v(t)$ whose 
average will generate the streaming velocity, $\langle v(t) \rangle = v_0$. 

To the best of the authors' knowledge, this is a novel derivation of the Brownian limit 
that does not require an explicit limiting process of an infinitely massive particle, or 
infinite momentum jumps between position changes. Instead, these two assumptions are 
implicitly present in assuming that $v, M\dot v$ are known during each position update. The 
memory term 
derived here is similar to the form postulated in Ref. jS], which lead to a quantitative 
treatment of memory effects. Here we can see it to be a natural consequence of placing 
constraints on the squared deviation of the action and the energy change at each time-step. 



2 Irreversible Thermodynamics 

Having firmly established the connection to the equilibrium distribution above, we can con- 
struct a high-level view of any process employing a series of transition probabilities to effect 
a change in the state of the system. This construction will lead naturally to a view of the 
process in terms of a thermodynamic path transforming one type of energy into another with 
a concomitant irreversible entropy production. 

To begin, we exactly define a system state, $A$, as any information which is known about 
a system that is sufficient to construct a probability distribution for its variables, 
$\mathcal{P}(x|A)$. 
The machinery of statistical mechanics can then be used to propagate this information to 
system states at other time-points and under alternate possible processes. 

The work of Joule and Thomson showed that there exists a series of mechanical opera- 
tions that can be performed to effect a transition between any two thermostatic states,^ 
however this transition can only take place in the direction of increasing entropy. The en- 
tropy increase comes about because of experimental inability to control the detailed motions 
of all particles, and is therefore zero in the case of completely controllable mechanical work. 
Thus, it is important to define a mechanical, adiabatic process, in which all work is completely
controlled by letting $C \to 0$ with constant external force experienced by the system
^elt,j = ^^VjPj ~^ 9j ft) -^cxt,j(^)- This is formally a zero-temperature, continuous-time limit, 
since at a finite temperature there is some amount of uncertainty about the exact state of
$I$ on short timescales, which leads to a discrepancy between the force exerted by the external
system and its "long-timescale" counterpart experienced by the system. In general, only a
subset of work values can be controlled, and before proceeding it will be necessary to solidify
the concept of controllable work.

If this work is to be delivered by an external thermostatic system, for example an adiabatically
coupled piston, then the first law of thermostatics gives $F_{\mathrm{ext},j} = -\partial U_{\mathrm{ext}}/\partial I_j =
\partial U_{\mathrm{ext}}/\partial I_{\mathrm{res},j} = -\beta_j/\beta_0$ if the force can be assumed constant over a sufficiently short
time-step. Mechanically, this force corresponds to the force on a wall exerted by a spring placed
externally to it. The total force on the wall is, of course,

$$F_{\mathrm{tot},j} = -\frac{\partial (U_{\mathrm{int}} + U_{\mathrm{ext}})}{\partial I_j} = F_{\mathrm{int},j} + F_{\mathrm{ext},j},$$

which implies that if a two-spring system were disconnected after a change $dI = [dI_1, \ldots, dI_m]^T$,
their internal potential energies would have changed by an amount

$$dU_{\mathrm{int}} = -dI^T F_{\mathrm{int}}, \qquad dU_{\mathrm{ext}} = -dI^T F_{\mathrm{ext}} = dI^T\beta/\beta_0. \qquad (23)$$

The sum of these two energy changes is not necessarily zero due to the possibility of mo- 
mentum change. Using the known energy change of the system, it should then be possible 
to solve for the change in kinetic energy of the constraint. For the system, the total energy 
change is given by 

dE = -^-£xe = dio 
= dW + dQ 

dW = dFa (24) 



T 



dQ = -[x-Yd'i (25) 

where we have used Eq. [17] for multiplied by da/dt, to define work values and a corre- 
sponding "non-mechanical" energy transfer, dQ. In the adiabatic limit, all energy transferred 
to the system by external forces should be reflected by known mechanical changes (related to
$\{I\}$), which is precisely what $\dot{a}$ allows us to do. Note also that unless specifically denoted
'ext' all quantities refer to the central system, so that dW means the work done on the 
system. 

In the mechanical limit, $-\delta\mathcal{A}/\delta x = \dot{p} - F_{\mathrm{int}} = F_{\mathrm{ext}}$, the external force experienced by the
system in the absence of the thermostatting random noise, and $\tilde{Y}\dot{a} = \dot{x}$. Substituting this
quantity from the Langevin equation (Eq. [20]), an adiabatic, mechanical system must satisfy

dQ = -{x-YaY'-^e 

= -]-{x-YafCYj3t = Q. (26) 






The last section gave some physical insight into the quantities $a$. A mathematical consideration
of the previous equation shows that $\tilde{Y}\dot{a} = \sum_{j \ge 1} y_j \dot{a}_j$ ($\tilde{Y}$ being identical to $Y$ with the
first column removed) can be understood as a projection, removing components of $\dot{x}$ parallel
to $\delta\mathcal{A}/\delta x$. If we therefore define (writing the Moore-Penrose pseudoinverse of $A$ as $A^{+}$)

$$\dot{a} = (C^{-1/2}\tilde{Y})^{+} C^{-1/2}\dot{x}, \qquad (27)$$

then $dQ = 0$ identically. For an example application, $y = 1$ (ones-vector) is associated with
the system momentum, and Eq. [27] generates the average velocity $\dot{a} = 1^T\dot{x}/N$ when $C = cI$.
Eq. [27] is invariant to multiplication of $C$ by a constant, and so persists in the deterministic
limit. The work done on the system (Eq. [24]) is

$$dW = -\frac{\delta\mathcal{A}}{\delta x}^{T}\,\tilde{Y}(C^{-1/2}\tilde{Y})^{+}C^{-1/2}\dot{x}\,\epsilon.$$
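
The projection in Eq. [27] can be checked numerically. The following is a minimal sketch, assuming a single controlled column $y$ and a diagonal $C$, for which the pseudoinverse reduces to a weighted least-squares ratio. The function name `controlled_rate` and the sample velocities are illustrative choices, not part of the derivation above.

```python
def controlled_rate(y, c_diag, x_dot):
    """Single-column, diagonal-C case of Eq. 27 (an illustrative special case).

    For one column v = C^{-1/2} y the pseudoinverse is v^T / (v^T v), so
    a_dot = (y^T C^{-1} x_dot) / (y^T C^{-1} y).
    """
    num = sum(yi * xi / ci for yi, xi, ci in zip(y, x_dot, c_diag))
    den = sum(yi * yi / ci for yi, ci in zip(y, c_diag))
    return num / den


x_dot = [0.3, -1.2, 2.0, 0.5]   # hypothetical particle velocities
ones = [1.0] * len(x_dot)       # y = 1 couples only to the total momentum

# C = c I: the controlled rate is the average velocity for any c > 0.
a_dot = controlled_rate(ones, [2.0] * 4, x_dot)
a_dot_rescaled = controlled_rate(ones, [7.0] * 4, x_dot)

# Non-uniform C: the uncontrolled remainder x_dot - y * a_dot is orthogonal
# to y in the C^{-1} metric (the least-squares normal equation).
c_diag = [1.0, 2.0, 0.5, 4.0]
a_general = controlled_rate(ones, c_diag, x_dot)
residual = sum((xi - a_general) / ci for xi, ci in zip(x_dot, c_diag))
```

In this special case the rescaling check illustrates the stated invariance of Eq. [27] under multiplication of $C$ by a constant, and the vanishing residual is the orthogonality property that a pseudoinverse solution provides.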

Note the information-theoretic quality of the work defined by the above equation. If 
separate reservoirs existed that were able to independently influence the motion of each 
particle in the system, then Y would become an identity matrix, and dW would equal dE 
identically. In the presence of noise, the work done does not necessarily equal the energy 
change. In the y = 1 example above, the work is 

$$\frac{1}{N}\Big(\sum_a \dot{p}_a - F_{\mathrm{int},a}\Big)\Big(\sum_a \dot{x}_a\Big)\epsilon. \qquad (28)$$

Here "$\sum_a$" denotes a sum over (one-dimensional) atoms, with obvious extension to multiple
dimensions. Because this interaction controls only the total system momentum, the
work is computed using the average velocity change. As shown in the appendix, the expec- 
tation of this stochastic integral is {dWp)/e = (3o/2x'^C{xa + A//3o)- Similarly, an electric 
field can couple only to the net dipole moment of a system. This implies the transformation 
of applied energy to heat if it cannot be manipulated to perfectly match fluctuations in the 
driven variable. 

The kinetic energy change of J, ascribed to the reservoir, can be determined in the 
mechanical limit from 

= dU^^t + dK-dW 
^dK = -dl'^0/(3o - d), (29) 

using Eqns. [23] and [24]. The above equations thus completely describe any exchange of
mechanical energy between deterministic systems exerting known forces. If the two-spring
system considered above were disconnected at time $S$, with an outside observer absorbing the
kinetic energy of the wall (Eq. [29]), the energy change of the reservoir would reduce to the
usual thermostatic potential change $dU_{\mathrm{ext}}(t)$.

^Note the similarity of this form to $\int \dot{p} - F\,dt$.

A well known consequence of Liouville's theorem is that the entropy change is zero in
a completely deterministic process. [ISl [IB] Using adiabatic processes, then, it is possible to
propagate a starting state, $A$, to any state with constant entropy. If, however, phase space
volume were not preserved, it would amount to a discarding of information on the state
of the system at a given time (for example, by integrating the probability over short time
intervals). Then the amount of work that can be recovered from the system will become less
than the amount input. In an extreme case, all information about the system may have been
lost, flushing the corresponding information content to zero. Starting from this unknown
state, $p^0$, the probability of a given frequency distribution, $p$, is approached by $P(p) \propto e^{\mathcal{H}[p]}$,
with $\mathcal{H}[p]$ the familiar information entropy functional

$$\mathcal{H}[p] = -\int p \ln(p/p^0). \qquad (30)$$



Further, if exchanges of conserved quantities during some process $A \to B$ are known, then
the set $I(B)$ is also known from $A$, and this information can be usefully employed to
increase the amount of work that can be recovered - showing the entropy as a measure of 
"lost work" . It therefore stands to reason that any adiabatic process should be described by 
not only the above mechanical work values, but also the change in information entropy due 
to information loss. 

Next, consider allowing heat exchange in addition to controllable work. Assuming an 
infinitely large reservoir (or a short enough time-step), added heat will cause a negligible 
change in reservoir temperature. Because we have assumed the work done on each reservoir
(Eq. [23]) can be reversibly stored, these are not associated with an entropy change. We therefore
introduce the physical entropy change in the reservoir as due only to exchange of heat, or
"non-work" energy, $dS_{\mathrm{ext}} = \beta_0\,dQ_{\mathrm{ext}}$, with $\beta_0^{-1} = k_B T_{\mathrm{ext}}$, and $dQ_{\mathrm{ext}}$ originating from heat
removed from the system plus a kinetic energy, $dK$, assumed to be recoverable only as heat.
In this work, different notations are used for the information entropy, $\mathcal{H}$, the physical entropy,
$S$, the caliber, $\sigma$, and the caliber-like functional, $\sigma^*$. All of these quantities are defined to
be unitless and have some relation to the information entropy of Eq. [30].

dQcxt = —dQ + dK = —dE — dUext 

= i^^^^ - dfp/P, = -dl^P/Po. (31) 

The energy rejected to the reservoir as heat can alternately be understood as the total
energy dumped to the environment minus the energy removed "reversibly" ($dU_{\mathrm{ext}}$). This
interpretation shows that if some of the changes in the environment were re-classified as
irreversibly stored, so that the information $\beta_j\,dI_j$ becomes lost, then this is equivalent to
adding that energy to the total $dQ_{\mathrm{ext}}$. The total energy rejected to the environment is
recoverable (the mechanical limit considered above) if and only if $dQ_{\mathrm{ext}} = 0$, implying $dE =
dI_0 = -dI^T\beta/\beta_0$. These considerations again highlight the subjective nature inherent in the
definition of irreversibility.

Connecting back to the usual thermostatics, $U_{\mathrm{ext}}$ plays the role of an energy for the
reservoir,

$$dU_{\mathrm{ext}} = dE_{\mathrm{ext}} - \beta_0^{-1}\,dS_{\mathrm{ext}} = dI^T\beta/\beta_0,$$

where we must provide an experimental justification for the ability to use or store the energy
terms appearing in the above sum.

For any transformation, $A \to B$, the total entropy change deriving from information loss
and interactions with the environment is given by the change in information entropy plus
the heat exchange term

$$\Delta S_{\mathrm{inf,tot}} = \mathcal{H}[\mathcal{P}(x|B)] - \mathcal{H}[\mathcal{P}(x|A)] + \beta_0 Q_{\mathrm{ext}}. \qquad (32)$$

It should be noted that this formula is still not complete if there is a change in the phase
space between $A$ and $B$, for example if particles are added/removed, or if the state space
is uniformly dilated. In this case, we have extra information on the region of phase space
occupied after a transition. In general, if the state at time $i$ is known to be $x_i$, then the size
of the region of configuration space accessible at time $i + 1$ is $Z_i(\beta, x_i)$ (Eq. [8]). This reduces
the entropy at $i + 1$ to $S_{i+1} - \ln Z_i[\beta, x_i]$. Writing this down for each transition,




dS, ^ - In _ dl^p. (33) 

The above equation has been symmetrized by including the corresponding entropy decrease
for $i$ given that $x_i$ was inferred from $x_{i+1}$, using the forward step probability but with reversed
forces, $-\beta$. Support for this form is given in the next section, where it is also shown that
$\Delta S_{\mathrm{tot}} \ge 0$ using the Gibbs inequality.

Nevertheless, Eq. [22] can already be applied to Langevin and Brownian motion with
uniform diffusion constants. As another example, applying Eq. [32] to a process taking any
point to the equilibrium distribution shows that the expected entropy change for this process
is the usual system entropy difference plus $\beta_0 Q_{\mathrm{ext}} = -\beta_0(\langle E|B\rangle - \langle E|A\rangle) - \cdots =
(I(A) - I(B))^T\beta$. Because the end-point entropies are fixed, alternate processes for transforming
$A \to B$ are restricted to varying $\Delta S_{\mathrm{ext}} = \int \beta_0\,dQ_{\mathrm{ext}}$. For such processes, we may write a
strong form for the Clausius form of the second law

$$\Delta S_{\mathrm{tot}} = \Delta\mathcal{H} + \int \beta_0\,dQ_{\mathrm{ext}} \ge 0. \qquad (34)$$

Here, $\Delta\mathcal{H}$ is a function of the end-points, and $\beta_0$ and $dQ_{\mathrm{ext}}$ are fully variable along the
path. Choosing a path from a fully specified distribution, $A$, to a maximum entropy distribution,
$I(A)$, and then to the ending maximum entropy distribution $I(B)$, we may employ
a quasistatic, "reversible," path between the two maximum entropy distributions, so that
$\min \int \beta_0\,dQ_{\mathrm{ext}}(A \to I(B)) = \min \int \beta_0\,dQ_{\mathrm{ext}}(A \to I(A))$. The heat evolved in this best-case
process has its origin in the re-classification of information that occurs during the coarsening
of $A$, in accordance with conclusions on Maxwell's demon. [17]
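
The relaxation example can be made concrete with a few lines of arithmetic. The sketch below assumes a discrete system that exchanges only energy with the reservoir (no work and no other conserved quantities) and a uniform reference measure, so that the information entropy differs from the Shannon entropy only by a constant that cancels in the difference. Under those assumptions Eq. [32] reduces to the relative entropy between the starting distribution and the Boltzmann distribution. All function names and numbers are illustrative.

```python
import math


def shannon_entropy(p):
    """-sum p ln p; a uniform reference measure only shifts this by a constant."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def boltzmann(energies, beta):
    """Equilibrium (maximum entropy) distribution at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def total_entropy_change(p_a, energies, beta):
    """Eq. 32 for a process taking p_a to equilibrium, heat exchange only."""
    p_b = boltzmann(energies, beta)
    mean_e_a = sum(p * e for p, e in zip(p_a, energies))
    mean_e_b = sum(p * e for p, e in zip(p_b, energies))
    q_ext = -(mean_e_b - mean_e_a)  # energy lost by the system goes to the bath
    return shannon_entropy(p_b) - shannon_entropy(p_a) + beta * q_ext


def kl_divergence(p, q):
    """Relative entropy sum p ln(p/q), non-negative by the Gibbs inequality."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


energies = [0.0, 1.0, 2.5]   # hypothetical level energies
beta = 1.3                   # hypothetical reservoir inverse temperature
p_start = [0.7, 0.2, 0.1]

delta_s_tot = total_entropy_change(p_start, energies, beta)
kl = kl_divergence(p_start, boltzmann(energies, beta))
delta_s_equilibrium = total_entropy_change(boltzmann(energies, beta), energies, beta)
```

Here `delta_s_tot` coincides with `kl`, so it is non-negative, and it vanishes when the starting distribution is already the equilibrium one.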

The equilibrium theory is therefore contained in the present development in the form of 
slow, quickly relaxing processes. This perspective shows the intimate connection between 
coarse-graining that assumes infinitely fast relaxation of the reservoir and the traditional
theory of quasi-static processes. However, the complete theory also permits an investigation 
of both relaxation processes and entropy production in time and history-dependent processes 
moved arbitrarily far from equilibrium by coupling to simple thermostatic reservoir systems. 

3 Predictive Statistical Thermodynamics 

As noted in the introduction, there is a fundamental difference between applying maximum
entropy to a complete trajectory and to each transition probability distribution individually.
Referring to Eqns. [6] and [8], the probability for a path $\Gamma = \{x_i\}_0^S$ (using a starting distribution
$\mathcal{P}(x_0)$ on $x_0$) is

$$\mathcal{P}(\Gamma) = \mathcal{P}(x_0)\,\mathcal{P}_0(\Gamma|x_0)\,\frac{e^{-\eta[\Gamma;\{\lambda\}]}}{Z[\{\lambda\},\{x\}_0^{S-1}]},$$

with

$$\eta[\Gamma;\{\lambda\}] = \sum_{i=0}^{S-1}\sum_k \lambda_{k,i}\,f_k(x_{i+1},\{x\}_0^i), \qquad
Z[\{\lambda\},\{x\}_0^{S-1}] = \prod_{i=0}^{S-1} Z[\lambda_i,\{x\}_0^i], \qquad
\mathcal{P}_0(\Gamma|x_0) = \prod_{i=0}^{S-1} p_0(x_{i+1}|\{x\}_0^i).$$

Defining a path information entropy, or caliber, associated with the number of possible ways
that a given $\mathcal{P}(\Gamma)$ could be observed, we have

$$\sigma_\Gamma = \langle\eta\rangle + \langle\ln Z\rangle + \mathcal{H}_0. \qquad (35)$$

This equation shows that the path entropy is an average of maximum entropy increments
given by Eq. [9]. This form telescopes in the following way. Suppose only the process from
time $0$ to $j < S$ is of interest. In this case, the above entropy expression can be separated
as

$$\sigma_\Gamma = \sum_{i=j}^{S-1}\Big( \Big\langle \sum_k \lambda_{k,i}\,f_k(x_{i+1},\{x\}_0^i) \Big\rangle + \langle \ln Z[\lambda_i,\{x\}_0^i] \rangle \Big) + \sigma_{\Gamma'}$$

with the non-anticipating partial sum

$$\sigma_{\Gamma'} = \sum_{i=0}^{j-1}\Big( \Big\langle \sum_k \lambda_{k,i}\,f_k(x_{i+1},\{x\}_0^i) \Big\rangle + \langle \ln Z[\lambda_i,\{x\}_0^i] \rangle \Big) + \mathcal{H}_0.$$

For times between $0 < j < S$, the first line above has exactly the same form as Eq. [35], with
$\mathcal{H}_0$ replaced by $\sigma_{\Gamma'}$. That is, the starting distribution has become a multi-step probability
distribution with no change in the total caliber.



3.1 Path Averages and Fluctuations 

Analogous to the equilibrium theory, we should expect that a cumulant expansion of an
appropriate path free energy will yield path averages, fluctuations, etc. From Eq. [35], a path
free energy functional can be defined as

$$\mathcal{F}[\lambda] = -\langle \ln Z\rangle = \langle\eta\rangle - \sigma_\Gamma + \mathcal{H}_0. \qquad (36)$$



Expanding 

dF d 



I (^-EinZ[A,;r]jp(r) dV 



{fk{xi+u {xjo)) + {fk{xi+u {x}o) - {fk{xi+i, {xYo))) 
{fk{Xi+l, {xYq)) . 



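we see that the derivatives of $\mathcal{F}$ indeed give the path averages, $\langle f\rangle$.

This result is valid for any chosen set of $\lambda$ and corresponding functions, $f_k$. It is therefore
possible to formally use the above to compute arbitrary path expectations, even if they do
not affect the dynamics ($\lambda = 0$). We can also use the above equation to formulate a first law
of time-dependent, nonequilibrium thermodynamics

$$d\mathcal{F}[\{\lambda\}] = \sum_{i=0}^{S-1}\sum_k \langle f_k(x_{i+1},\{x\}_0^i)\,|\,\lambda\rangle\,d\lambda_{k,i}. \qquad (37)$$

We can similarly compute second derivatives to give

$$\frac{\partial^2\mathcal{F}}{\partial\lambda_{k,i}\,\partial\lambda_{l,j}} = \langle f_{k,i}\rangle\langle f_{l,j}\rangle - \langle f_{k,i}\,f_{l,j}\rangle
= -\mathrm{Cov}\left[f_k(x_{i+1},\{x\}_0^i),\; f_l(x_{j+1},\{x\}_0^j)\right]. \qquad (38)$$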
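
The derivative relations can be checked by finite differences in the simplest setting. The sketch below assumes a single transition step with one constraint function on a small discrete state space, so that the free energy reduces to $-\ln Z(\lambda)$ with $Z = \sum_x p_0(x) e^{-\lambda f(x)}$; the path average over earlier steps is dropped for brevity. The state values, reference weights, and step size are illustrative.

```python
import math


def free_energy(lam, f_vals, p0):
    """Single-step free energy F = -ln Z, with Z = sum_x p0(x) exp(-lam f(x))."""
    return -math.log(sum(q * math.exp(-lam * f) for f, q in zip(f_vals, p0)))


def mean_and_variance(lam, f_vals, p0):
    """Mean and variance of f under the maximum entropy transition probability."""
    weights = [q * math.exp(-lam * f) for f, q in zip(f_vals, p0)]
    z = sum(weights)
    mean = sum(w * f for w, f in zip(weights, f_vals)) / z
    var = sum(w * (f - mean) ** 2 for w, f in zip(weights, f_vals)) / z
    return mean, var


f_vals = [-1.0, 0.0, 0.5, 2.0]   # hypothetical values of the constraint function
p0 = [0.1, 0.4, 0.3, 0.2]        # hypothetical reference transition probability
lam, h = 0.7, 1e-3               # multiplier and finite-difference step

f_plus = free_energy(lam + h, f_vals, p0)
f_mid = free_energy(lam, f_vals, p0)
f_minus = free_energy(lam - h, f_vals, p0)

first_derivative = (f_plus - f_minus) / (2.0 * h)            # approximates <f>
second_derivative = (f_plus - 2.0 * f_mid + f_minus) / h**2  # approximates -Var[f]
mean, var = mean_and_variance(lam, f_vals, p0)
```

Within finite-difference error, `first_derivative` matches `mean` and `second_derivative` matches `-var`, which are the one-step analogues of Eqs. [37] and [38].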

Using the Legendre transformations ($\mathcal{F} - \lambda\langle f\rangle$) given in Eqns. [11] and [12], these fluctuations
can be transformed to ensembles with constrained averages (acceleration, particle flux, etc.),
$\langle f\rangle$, rather than thermodynamic forces, $\lambda$.

A simple derivation for such equations can be given following Ref. [3]. Performing a
second-order expansion using Eq. [38] and writing the result in matrix form,



8X28X1 8X28X2 



8^T 
8X18X2 
8^T 





'dX{ 




dX2_ 



d{fh 



we swap the sides of a set the averages d{f)2 and their corresponding forces dX2- 



A B 




dXi 


C D 




dX2 



dXo 



dXi 



d{f), 



Re-assembling the matrices on each side, and inverting 



/ -B 

-D 



d{f); 




'A - BD-^C 


-BD-^' 




dXi 


dX2 




-D-^C 






d{f)2_ 



9-2^ 
'dxJ 



8Xi8(f), 
g2^ 



d{f)2dXi 
T I 
dim

dm 





dXi 




d{f)2_ 



{f)ldX, - X^dif),. 



(39) 
(40) 



The above manipulations, well-known in the theory of linear, local equilibrium processes,
are seen as Gibbs relations in the general nonequilibrium theory developed here. These
equations describe a change of ensemble in the equilibrium theory [5^. In this respect,
they form a basis for connecting stochastic, Langevin dynamics simulations (e.g. Eq. [20])
to constant kinetic energy, solute flux, etc. ensembles studied extensively in nonequilibrium
molecular dynamics simulations [35]. In addition, the upper-right matrix element,
$d\langle f\rangle_1 = -\mathrm{Cov}[f_1, f_2]\,\mathrm{Cov}[f_2, f_2]^{-1}\,d\langle f\rangle_2$, appears as the starting point for the Mori
projector-operator method (15| as has been discussed by Jaynes. [23] [19]
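
The change of independent variables used in this manipulation is ordinary linear algebra and can be illustrated with scalar blocks. The sketch below assumes the linear response has the generic form $d\langle f\rangle_1 = A\,d\lambda_1 + B\,d\lambda_2$, $d\langle f\rangle_2 = C\,d\lambda_1 + D\,d\lambda_2$ and solves it for the mixed set $(d\langle f\rangle_1, d\lambda_2)$. The block values and the function name `swap_second_pair` are hypothetical, and no claim is made about the sign conventions adopted in Eq. [39].

```python
def swap_second_pair(A, B, C, D, d_lam1, d_f2):
    """Solve d_f1 = A d_lam1 + B d_lam2 and d_f2 = C d_lam1 + D d_lam2
    for (d_f1, d_lam2), treating d_lam1 and d_f2 as the independent variables."""
    d_lam2 = (d_f2 - C * d_lam1) / D
    d_f1 = A * d_lam1 + B * d_lam2
    return d_f1, d_lam2


# Hypothetical scalar second-derivative blocks (symmetric, B == C).
A, B, C, D = -2.0, -0.6, -0.6, -1.5

# Forward problem: both forces are varied.
d_lam1, d_lam2 = 0.01, -0.02
d_f1 = A * d_lam1 + B * d_lam2
d_f2 = C * d_lam1 + D * d_lam2

# Mixed problem: recover d_f1 and d_lam2 from d_lam1 and d_f2.
mixed_f1, mixed_lam2 = swap_second_pair(A, B, C, D, d_lam1, d_f2)

# Response of <f>_1 to lambda_1 with <f>_2 held fixed: the Schur complement.
constrained_response, _ = swap_second_pair(A, B, C, D, 1.0, 0.0)
schur_complement = A - B * C / D
```

The round trip recovers the original increments, and the constrained response equals the Schur complement $A - BD^{-1}C$, which is the block that appears when a force is traded for its conjugate average.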

Using the action-type conserved quantity constraints (Eq. [18]) in the free energy functional,
its derivatives generate the fluxes $\partial\mathcal{F}/\partial(\beta_j(t)/2) = \langle dI_j(t)\rangle$. At equilibrium, these fluxes
are zero. Given information about the previous history of the system, $\langle dI(t - \tau)\rangle$, we can
use Eq. [39] to find the linear change



dm 



t'<t 



(dm) 



(41) 



[di{t + i)diitf){di{t)diitf) \di{t - 1)) 



-CTi 



(To CTi 


~1 


'{dl{t - 


1))" 


_(Ti ao_ 




{dllt - 


2)). 



two steps 



{dl{t — 1)), one step 
(42) 

(43) 
This is the linear equation of motion for a near-equilibrium system under a mechanical driving
force, and explains the ubiquitous use of Fourier transforms in solving these equations. It
could be expanded to arbitrary order using a Taylor series in



a(/3,(t)/2)- 

The case of driving by thermal, or constant external force, conditions is much simpler, 
and we may accordingly treat the more complicated case of a transient process. According to 
Eqns. [37] and [381 we may expand about an arbitrary initial distribution plus some reference 
program, (denoted by the zero subscript), to find the equation of motion for thermal 

driving. 

(./(*)> « m))„ + E 9(pmm/2) ''-^'-'^'''' 
= {di(t%-\Y,{dmdi(tT),m') 

t'<t 

This relation contains a factor of 1/2, as in a derivation by Searles and Evans [50], where it
was shown to reduce to the Green-Kubo expression in the zero-field limit. As noted in that
work, the last term on its own is incorrect when a driving force is present.

The leading term in the above is the flux in the reference process. Choosing this reference 
process as a conducting steady-state shows one mechanism for the failure of the Onsager 
reciprocity relations. The relations = C should hold in any case, but are only derivatives 
of the flux/force curve. These derivatives are analogous to the fluctuation moments of the
canonical ensemble, which can approximate the energy at a slightly altered temperature.
They are expansions about a fully nonlinear free energy functional, $\mathcal{F}$.

3.2 Path Perturbation and Connection to Entropy Production 
Theorems 

The linear relations derived in the last section should not be expected to hold for large 
deviations in the nonequilibrium forces. We can progress beyond this limitation by analogy 
to the transition from thermodynamic integration to free energy perturbation in equilibrium 
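free energy calculations. Any two processes on the same path space, $\{\Gamma\}$, can be compared
using a likelihood,

$$l_{A\to B}[\Gamma] = \ln\frac{\mathcal{P}_B(\Gamma)}{\mathcal{P}_A(\Gamma)} = -(\eta_B - \eta_A)[\Gamma] - \ln\frac{Z_B}{Z_A}[\Gamma].$$

The likelihood obeys $\langle e^{l_{A\to B}[\Gamma]}\rangle_A = 1$ and (by the Gibbs inequality) $\langle l_{A\to B}[\Gamma]\rangle_A \le 0 \le
\langle l_{A\to B}\rangle_B$. The distribution of the weight satisfies the perturbation formula [51]

$$e^{L}\,\mathcal{P}_A(L = l_{A\to B}[\Gamma]) = \int e^{l_{A\to B}[\Gamma]}\,\delta(L - l[\Gamma])\,\mathcal{P}_A(\Gamma)\,d\Gamma
= \int \delta(L - l[\Gamma])\,\mathcal{P}_B(\Gamma)\,d\Gamma
= \mathcal{P}_B(L = l_{A\to B}[\Gamma]).$$

It is possible to express path averages using the above quantity as

$$\langle a[\Gamma]\rangle_B = \langle a[\Gamma]\,e^{l_{A\to B}[\Gamma]}\rangle_A. \qquad (45)$$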
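
The re-weighting identities can be verified exactly on a path space small enough to enumerate. The sketch below assumes two hypothetical two-state Markov chains, A and B, sharing a starting distribution, and sums over every path of five states. The transition matrices, the observable, and all names are illustrative choices and are not taken from the derivation.

```python
import itertools
import math


def path_probability(path, start, transition):
    """Probability of a discrete path under a time-homogeneous Markov chain."""
    prob = start[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= transition[a][b]
    return prob


start = [0.5, 0.5]                    # shared starting distribution
T_A = [[0.9, 0.1], [0.2, 0.8]]        # hypothetical process A
T_B = [[0.6, 0.4], [0.5, 0.5]]        # hypothetical process B
paths = list(itertools.product([0, 1], repeat=5))


def log_likelihood(path):
    """l_{A->B}[path] = ln P_B(path) / P_A(path)."""
    return math.log(path_probability(path, start, T_B)
                    / path_probability(path, start, T_A))


def observable(path):
    """Example path functional: number of time points spent in state 1."""
    return sum(path)


p_a = [path_probability(g, start, T_A) for g in paths]
p_b = [path_probability(g, start, T_B) for g in paths]
logl = [log_likelihood(g) for g in paths]

norm = sum(pa * math.exp(l) for pa, l in zip(p_a, logl))         # <e^l>_A
avg_b_direct = sum(pb * observable(g) for pb, g in zip(p_b, paths))
avg_b_reweighted = sum(pa * observable(g) * math.exp(l)
                       for pa, g, l in zip(p_a, paths, logl))    # Eq. 45
mean_l_a = sum(pa * l for pa, l in zip(p_a, logl))               # <l>_A
mean_l_b = sum(pb * l for pb, l in zip(p_b, logl))               # <l>_B
```

The enumeration gives `norm` equal to one, identical direct and re-weighted averages, and `mean_l_a <= 0 <= mean_l_b`, in line with the Gibbs inequality. In a simulation the exact sums would be replaced by sample averages over trajectories drawn from process A.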

However, in the equilibrium theory, a constant related to the free energy difference is usually
cancelled on the right-hand side of this expression. There is no such constant in the above
equation because we have not identified an appropriate extensive variable. In the present
case, we can use the free energy (Eq. [36]) to define



Va (r) e~-^-4[A] 



in terms of which 



/ A 



and 



>7 



Of course, the above expressions may be of little analytical use because $\langle \ln Z[\lambda_i, \{x\}_0^i]\rangle$ has
simply been subtracted from 77[r] (in the exponent of e^°^['^''^^^ol. However, the transition 
probabilities usually adopt particularly simple forms, e.g. normal distributions. If, for 
example, the transition probability takes every point to an equilibrium distribution, then 
the above expansion is particularly useful. 

It has been claimed that entropy production can be gauged by the ratio of forward to
reverse path probabilities. [52] [10] For the process derived in Sec. [H we can define a reverse
process by inverting the sign of the generalized forces, $\beta_i$, and normalizing the distribution
separately for each $x_{i+1}$. Such a reversal corresponds to an attempt at guessing whether
energy has been added or subtracted during a step $i \to i + 1$, with the action deviation
constraint, $C$, unchanged. Thus, we can define a ratio

^ V{xi+i\xi,(3)V{xi) 
V {xi\xi+i, -P)V (xi+i) 



V [xj) PQ{xi+i \xi)Z[-f3, Xj+i] 
'P ixi+i)po{xi\xi+i)Z[(3, Xi] 



Since po(^'+iI^') = poi^t+i) Bayes' theorem, this result exactly matches Eq. [33] arrived at 
through thermodynamic reasoning. 

Moreover, several steps can be concatenated to give

$$e^{\sum_{i=0}^{S-1} dS_i} = \frac{\mathcal{P}(\Gamma|x_0,\beta)\,\mathcal{P}(x_0)/p_0(x_0)}{\mathcal{P}(\Gamma|x_S,-\beta)\,\mathcal{P}(x_S)/p_0(x_S)}. \qquad (47)$$

The above equation can therefore be viewed as a statistical basis for the second law of
thermodynamics. It is physically motivated by observing that the entropy increase is attributed
to a combination of environmental entropy changes, $dQ_{\mathrm{ext}}$, and information-like entropy
changes,

''p(x,)Z[-/3,x,+i]/po(x.) ^ ^ 



The above term implies an extra contribution to the total entropy change beyond Eq. 
To understand this contribution, consider a simple one-state system, to be transformed into a
two-state system through the (unbiased) transition probability $\mathcal{P} = p_0(x_{i+1}) = 1/2$.
The presence of the normalization in Eq. [33] leads to an additional factor of $-\ln Z[x_i]/p_0(x_{i+1}) =
-\ln 2$ in the entropy. However, this term is canceled by the entropy of the resulting state,
$\mathcal{P}(x_{i+1}) = 1/2$, so that the total entropy change for this process is zero. If, instead, we
perform a bit-set operation by going in the opposite direction, the entropy will increase if
$\mathcal{P}(x_{i+1})$ is any non-uniform state, corresponding to our loss of information on discarding
the bit. A term of this form is exactly what we should have expected when writing down
Eq. [32]. Applying this equation to a situation where a particle is added to the system, we find
that the re-normalization will physically compensate for the expansion of phase space which 
leads to a difference in entropy measures. Mathematically, this implies that the "default" 
measure ^0(2^1+1) is replaced by the re-normalized measure pQ{xi+i)Zl^{xi+i) / Zx,r{xi) if we 
have information about the previous state when calculating the entropy at state i + 1. 

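Equation [47] is connected to the likelihood definition (Eq. [44]). We define $B$ as a "forward"
process, starting from $\mathcal{P}(x_0)$ and employing $\lambda_B = \{C, \beta_{i,k}/2\}$, and $A$ as a "reverse" process,
employing $\lambda_A = \{C, -\beta_{i,k}/2\}$. Because

$$\frac{\mathcal{P}(\Gamma|B)}{\mathcal{P}(\Gamma|A)} = \frac{\mathcal{P}(\Gamma|x_0,B)\,\mathcal{P}(x_0|B)}{\mathcal{P}(\Gamma|x_S,A)\,\mathcal{P}(x_S|A)}$$

and the end-point distributions $\mathcal{P}(x_0)$, $\mathcal{P}(x_S)$ have been pre-determined, we can write Eq. [47]
as

$$\Delta S_{\mathrm{tot}} = \langle l_{A\to B}\rangle_B, \qquad (49)$$

which implies

$$\Delta S_{\mathrm{tot}} \ge 0.$$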



This completes the connection between the physical entropy production defined in Sec. [2] 
and fluctuation theorems of the form flUl) . 

However, the entropy production has not yet been connected to the caliber functional
(Eq. [35]). To do this, we decompose Eq. [36] into the form of Eq. [35] as

$$\Delta S_{\mathrm{tot}} = \sigma^*_\Gamma - \sigma_\Gamma \qquad (50)$$

by defining



(5^ciJ^/3/2)^ + (lnZ[-/3])^ + H 



s- 



(51) 



Although this definition is similar to the caliber, it is not an information entropy, since the
averages are taken with respect to the forward probability distribution, $\mathcal{P}(\Gamma|x_0, \beta)\,\mathcal{P}(x_0)$.
The choice of the forward direction corresponds to the direction in which information
propagates [37]. It determines the target distribution for taking the divergence (Eq. [49]).

3.3 Jarzynski's Equality 

In the special case where a stationary distribution is known during each time-propagation
step, a set of useful equalities can be derived simply from the re-weighting equation



The first average represents a single-time average over the steady state distribution used to
propagate the system during the transition $T \to T + 1$. The second average is a path average
over stochastic trajectories beginning in the steady-state at time zero.

The proof mj is by recursion from the property of the stationary distribution under time- 
propagation. 



These relations will work for any stationary distribution of the transition probability. Equal- 
ity of the starting and ending temperatures is not required. 

4 Connecting the First and Second Laws 

In order to give concrete examples of Eqns. [38] and [33], we must choose a set of coarse
variables of interest ($\langle f\rangle$) and follow their time-evolution, $\langle df(t)\rangle$. Most mesoscopic models
contain hydrodynamic equations of motion for the solution density. A rigorous route [53]
to their derivation is by forming suitable integrals of the probability distribution function 
appearing inside the Liouville equation, and much of the early literature on nonequilibrium 
problems is focused on this derivation. In the maximum transition entropy context, the 
resulting equations describe the propagation of a state of knowledge forward in time us- 
ing an exact equation of motion. The exact internal and external forces on the system are 
required using this route, and the projector-operator formalism is used to add the uncer- 
tainty introduced by mixing processes occurring below the size and time-scale of the density 
function. 




(52) 




(53) 
In the stochastic formalism developed here, the Liouville equation has been replaced with
the Klein-Kramers and Smoluchowski equations. In addition to convection, these equations
also describe diffusion of probability that occurs because of loss of information during mixing
processes. Summing the Fokker-Planck equation for Eq. [22] over particles of each species type,
$\alpha$, and integrating over coordinates other than that of a single, distinguished molecule (using
the relations from Ref. [53]) gives the one-particle evolution equation

$$\frac{\partial\rho(r,t)}{\partial t} = -\nabla_r\cdot\big(D\,(\beta F\rho(r,t) - \nabla_r\rho(r,t))\big). \qquad (54)$$

In this process, the force and diffusion coefficients have become averages over the probability
distributions of the molecules of each type, i.e. a potential of mean force. Here, $D$ has been
substituted for $C_x/2$.

The form of equation [54] corresponds exactly to the usual diffusion equation used for the
continuum formulation of electrodiffusion equations [51], without the necessity of considering
the damping or particle radius and Stokes-Einstein relation.

In the one-particle approximation, there is an information loss associated with integrating
over the full N-particle distribution, $\rho(x,p,t)$. Because less information is used to propagate
the system, there is a start-up entropy production of $\mathcal{H}_{\max}[\rho(x,p,t)|\rho_\alpha(r,t)] - \mathcal{H}[\rho(x,p,t)] \ge 0$
that is due to a loss of useful work that may be extracted from the system without this
information. Time-evolution followed by discarding information on particle correlations then
transforms Eq. [33] into

$$dS_i = \mathcal{H}_{\max}[\rho(t + \epsilon)] - \mathcal{H}_{\max}[\rho(t)] - dI^T\beta \ge 0. \qquad (55)$$

For the Fokker-Planck equation of Brownian motion (15^ . the only information used to 
propagate the distribution is the density at each time-step, and the maximum entropy dis- 
tribution is the momentum-free distribution with known average spatial density and average 
energy. [5l] In the special case where the energy is a local function of the density, this leads 
to the well-known local equilibrium theory [38]. 

The momentum and non-diffusive conservation equations may also be derived from the
full Langevin equation (Eq. [20]) in a manner similar to that shown in Ref. [S2]. Although more
complex, these hydrodynamic systems are amenable to the analysis given above. The en- 
tropy production will be a combination of heat production and information loss. If the 
system moves between two steady-states, the entropy production will be given simply by 
the end-point information entropy difference plus the integral of the evolved heat divided by 
the temperature of the external heat reservoir. These quantities are both linked to informa- 
tion loss because the microscopic details of the energy exchange have been replaced by less 
informative probability distributions over the states of both systems. 

4.1 A Simple Application 

For a numerical calculation, we may turn to the one-dimensional example of an optically
trapped bead or an atomic-force microscope pulling experiment. Assuming very fast relaxation
of the momentum of the pulling coordinate over a potential energy surface, $U$, it is
appropriate to use Eq. [22] with the average velocity set at zero. In the case of an optical trap
with center $X_0(t)$ and force $F(x,t) = -k(x - X_0(t))$, we can employ a change of coordinates
to $y(t) = x(t) - X_0(t)$ (with $\dot{X}_0(t) = v(t)$) so that during each time-step, the position and
trap center are updated to give

$$\dot{y}\,\epsilon = -v(t)\,\epsilon - \frac{k}{\gamma}\,y\,\epsilon + C_x^{1/2}\,dW, \qquad \gamma^{-1} = \beta C_x/2. \qquad (56)$$

The constraint on $\dot{x}^2\epsilon/C_x$ determines the rate at which the bead is allowed to dissipate
energy into the solution, while the energy constraint, $\beta$, acts as a driving force for net energy
exchange.

In the Brownian case, the system cannot distinguish between internal and external forces,
so that Eq. [21] shows energy decrease as work from all applied (assumed internal) forces is
dissipated into the surroundings as heat at each time-step. Reversing our sign convention to
treat the harmonic trap as external, the work done on the bath through the system is

$$\langle dW_F|y_i\rangle = \langle F\dot{x}\epsilon|y_i\rangle = -\langle k y \dot{x}\epsilon|y_i\rangle = \frac{k}{\gamma}\left(k y^2 - 1/\beta\right)\epsilon. \qquad (57)$$

Although we are not including this term in our analysis, if a potential had been present for 
the particle, at each step energy 

{dWu\y^) = (-^xe\y,\ (58) 



dx 



(59) 



would also have been converted into heat from the bath system's internal potential, 

The entropy increase of Eq. [33] (Eq. [16]) is formally a path average, and its evaluation 
requires specifying an initial state and a driving protocol. As in Ref. we may choose to 
follow several velocity programs starting from a steady-state at constant pulling force. 

$$p(y|v) \propto e^{-\frac{\beta k}{2}(y + \gamma v/k)^2}. \qquad (60)$$

The 'housekeeping heat' dissipated by the bath's removal of the bead's momentum at each
time-step leads to a steady-state dissipation of $\langle dW_F\rangle/dt = \gamma v^2$. After a sufficiently long
time, the instantaneous information entropy will reach its steady-state value, $\mathcal{H}[\beta,k] =
(1 - \ln\frac{\beta k}{2\pi})/2$. Note that the entropy of the steady-state distribution is well-defined because it
is invariant to the change of coordinates $x \to y$. For constant pulling force and temperature,
this expression says that over long time-periods the integral of the information entropy change
will be zero, and the total entropy increase will be due completely to dissipated work.



^A pulling potential of mean force is also commonly used for $U$, showing that this term can be
characterized as a coupling to an external thermostatic system.

During intermediate time-periods, however, the entropy increase will be a combination
of changes in the position distribution plus the dissipated work.



dS, = dn, - (3{dE)^ = ^- In l^^l^^ + PdWFiy,+,; 1/^)) • (61) 

Because the dynamics is Markovian, the average dissipated work can be easily calculated
from the distribution at each time-step and the total of each type of work will be a sum of
one-step stochastic integrals (Eq. [57]). Assuming the distribution is Gaussian at a starting
time and Fourier-transforming Eq. [60] gives a Gaussian distribution at all future times for
all driving protocols, whose mean ($\mu$) and variance ($w$) are solutions of first-order ordinary
differential equations,

$$\frac{d\mu}{dt} = -\left(v + \frac{k\mu}{\gamma}\right), \qquad \frac{dw}{dt} = 2D\,(1 - \beta k w).$$

The information entropy and work follow

$$\frac{d\mathcal{H}}{dt} = D\left(\frac{1}{w} - \beta k\right), \qquad \frac{dW_F}{dt} = \frac{k}{\gamma}\left(k(\mu^2 + w) - \frac{1}{\beta}\right).$$

Using $x + x^{-1} \ge 2$, we can easily verify

$$\langle dS_i\rangle = D\beta^2 k^2\mu^2 + D\beta k\left[\beta k w + (\beta k w)^{-1} - 2\right] \ge 0. \qquad (62)$$
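
The moment equations are simple enough to integrate directly. The following is a rough sketch, assuming forward-Euler time stepping, dimensionless parameters, and a protocol in which the trap stiffness jumps and a constant pulling velocity is switched on at time zero; none of these choices reproduce the protocols of Fig. 1. Eq. [62] is read here as a rate per unit time, with $D = 1/(\beta\gamma)$, and all names are illustrative.

```python
def entropy_rates(mu, w, k, beta, gamma):
    """Rates of information entropy change and of work dissipated to the bath."""
    D = 1.0 / (beta * gamma)
    d_entropy = D * (1.0 / w - beta * k)
    d_work = (k / gamma) * (k * (mu * mu + w) - 1.0 / beta)
    return d_entropy, d_work


def closed_form_rate(mu, w, k, beta, gamma):
    """Right-hand side of Eq. 62, interpreted as a rate."""
    D = 1.0 / (beta * gamma)
    x = beta * k * w
    return D * beta**2 * k**2 * mu**2 + D * beta * k * (x + 1.0 / x - 2.0)


def integrate(mu, w, v, k, beta, gamma, dt, n_steps):
    """Forward-Euler integration of the mean and variance equations."""
    D = 1.0 / (beta * gamma)
    history = []
    for _ in range(n_steps):
        d_entropy, d_work = entropy_rates(mu, w, k, beta, gamma)
        history.append((mu, w, d_entropy + beta * d_work))
        mu, w = (mu + dt * (-(v + k * mu / gamma)),
                 w + dt * 2.0 * D * (1.0 - beta * k * w))
    return mu, w, history


beta, gamma = 1.0, 1.0          # hypothetical dimensionless units
k_initial, k_final, v = 1.0, 2.0, 1.0

# Start in equilibrium with the initial trap, then stiffen the trap and pull.
mu_end, w_end, history = integrate(0.0, 1.0 / (beta * k_initial), v, k_final,
                                   beta, gamma, dt=1e-3, n_steps=20000)

rates = [rate for _, _, rate in history]
max_mismatch = max(abs(rate - closed_form_rate(m, w, k_final, beta, gamma))
                   for m, w, rate in history)
```

With these parameters the mean relaxes to $-\gamma v/k$, the variance to $1/(\beta k)$, and the entropy production rate to $\beta\gamma v^2$, while the sum of the information and heat terms agrees with the closed form of Eq. [62] at every step and never becomes negative.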

Using numerical integration, we have plotted (Fig. [1]) the rate of entropy production vs. time 
for two hypothetical driving protocols. The units in the figure are the same as in Ref. [3], 
and their third protocol (followed by its time-reversal starting at 0.28 s) has been used for 
the upper two sets of panels. Because the variance of the distribution only responds to 
changes in diffusion constant, temperature or driving force, we have varied k in the second 
set of calculations. The information entropy rate gain goes to zero and the heat production 
becomes constant at the onset of the eventual steady-state. As shown by the heat production 
during compression from k = 3 to 5 pN/μm, excess heat production is required whenever the 
information entropy decreases. When the distribution expands, heat production decreases 
as the mean begins to lag behind the trap center and the distribution expands. This is 
counter-balanced by an increase in information entropy, leading to net dissipation. 
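
The kind of numerical integration described here can be sketched in a few lines. The example below assumes the moment equations $d\mu/dt = -(v + k\mu/\gamma)$ and $dw/dt = 2D(1 - \beta k w)$ and the rates $d\mathcal{H}/dt = D(w^{-1} - \beta k)$ and $\beta\, d\langle W_F\rangle/dt = D\beta k(\beta k(\mu^2 + w) - 1)$. The forward-Euler scheme, the smoothstep ramp `smooth_k`, and all parameter values (dimensionless units) are hypothetical choices for illustration; they are not the protocols or units used for Fig. [1].

```python
import math

def entropy_rates(mu, w, k, beta, D):
    """Return (dH/dt, beta*d<W_F>/dt) for a Gaussian with mean mu, variance w."""
    info_rate = D * (1.0 / w - beta * k)
    heat_rate = D * beta * k * (beta * k * (mu * mu + w) - 1.0)
    return info_rate, heat_rate

def integrate_moments(v_of_t, k_of_t, beta=1.0, gamma=1.0,
                      mu0=0.0, w0=1.0, t_end=10.0, dt=1e-3):
    """Forward-Euler integration of the mean and variance equations."""
    D = 1.0 / (beta * gamma)
    mu, w = mu0, w0
    history = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        t = step * dt
        v, k = v_of_t(t), k_of_t(t)
        info_rate, heat_rate = entropy_rates(mu, w, k, beta, D)
        history.append((t, mu, w, info_rate, heat_rate))
        mu += -(v + k * mu / gamma) * dt
        w += 2.0 * D * (1.0 - beta * k * w) * dt
    return history

def smooth_k(t, k_lo=3.0, k_hi=5.0, t_start=2.0, duration=1.0):
    """Hypothetical cubic (smoothstep) ramp of the force constant."""
    s = min(max((t - t_start) / duration, 0.0), 1.0)
    return k_lo + (k_hi - k_lo) * (3.0 * s * s - 2.0 * s ** 3)

history = integrate_moments(v_of_t=lambda t: 1.0, k_of_t=smooth_k)
t, mu, w, info_rate, heat_rate = history[-1]
# At late times the information rate approaches 0 and the heat rate
# approaches beta*gamma*v**2.
print(info_rate, heat_rate)
```

In this sketch the sum of the two rates stays non-negative at every step, including during the compression ramp where the information entropy rate is negative.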

We note that a large amount of additional complexity can be added to this model by 
adding information about the variables here treated as 'external' to the description of the 
dynamics. If local variations in the fluid velocity or temperature were included, then the 
dynamics would have to specify the equations of motion for these fields. The final entropy 
increase may then be more or less than this result because these degrees of freedom may be 
responsible for additional heat production, but more information on the fluid state has been 
included, leading to decreased information loss. 

Figure 1: Calculated entropy production during a transition between steady-states. The left 
set of panels show the imposed velocity, v, solid line; force constant, k, long dashed line 
on the left scale; and response of the mean, μ, on the right vs time (s). The right set of 
panels show the decomposition of the entropy increments (Eq. [61], pN·μm) into heat (solid) 
and information gain/loss (long-dashed). Whenever the information entropy decreases, an 
equal or larger amount of heat is produced so that the total (Eq. [22], short-dashed) is always 
positive. For the upper two sets of panels, the force constant was held constant at k = 4.9 
pN/μm so no change in information entropy occurs. For the lower two, the distribution 
is compressed, then broadened by changing k between 3 and 5 pN/μm using a cubic interpolation 
lasting 80 ms. 

5 Conclusions 



In this paper, we have given a generalization of the theory for driven, irreversible processes. 
A set of transition probabilities defines the evolution equation for the system of interest. A 
special simplification is the case of Langevin and Brownian motion, which can be recovered 
as limits of a constrained action integral approach. Non-anticipating stochastic trajectories 
for classical particle and field motion can be cast in this form. The action functional in- 
terpretation gives a physical method for defining conserved quantities and the energy cost 
associated with transfers of these quantities from an external environment or experimental 
apparatus. 

Deterministic dynamics is recovered from the Langevin equation when the deviation of 
the action functional is strongly constrained to zero (Sec. [T]). In this limit, the external 
forces which appeared as statistical in the stochastic approach become mechanical. Because 
both limits appear in this derivation, the fluctuation-dissipation theorems derived as Gibbs 
relations from Eq. [37] are applicable in the case of both thermal and mechanical driving forces. 
These equations are completely general in the sense that they apply not only arbitrarily far 
from equilibrium, but also during transient processes which do not possess a steady state. 

A particularly useful aspect of this approach is that it directly connects multiple length 
and time-scales. The formulation of the equations has been in terms of particle motion, 
but coarse-grained relations are easy to define as appropriate ensemble averages over these 
motions. Examples of such averages include centers of mass for polymer units or average 
density and velocity fields. The coarse equations of motion will then lead to polymer coarse- 
graining models[55] or non-local hydrodynamic models. [SSI HI ES] For the time-evolution 
of average quantities, we expect the thermodynamic limit argument [37] to apply when the 
number of averaged degrees of freedom is large so that the path realized by the system 
under a given set of constraints will fall arbitrarily close to the maximum entropy solution 
an overwhelming majority of the time. The present work is therefore a suitable foundation 
for the theory and analysis of nonequilibrium molecular dynamics. 

Applications to simplified, standard examples such as circuit theory are easily accom- 
plished. The Joule heating of a resistor, for example, can be seen from Eq. [28] as fundamen- 
tally arising from the difference between the velocity added to each ion individually vs. the 
usable energy in the average ion velocity. Because the energy added to the system in driving 
the ions is not expressible in terms of the average velocity alone, spreading in the distribu- 
tion of ion velocities becomes heat. The same remarks follow for driven convective transport, 
where a spreading in the distribution of forward fluid momentum leads to increases in the 
local temperature (Eq. [2T|) . 
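
The distinction between energy carried by the average velocity and energy carried by the spread can be illustrated with a toy calculation. The sketch below is only the elementary variance decomposition of kinetic energy, not an implementation of Eq. [28]; the function name, the unit mass, and the randomly generated ion velocities are hypothetical.

```python
import random

def split_kinetic_energy(velocities, mass=1.0):
    """Split total kinetic energy into mean-flow and spread contributions."""
    n = len(velocities)
    v_mean = sum(velocities) / n
    total = 0.5 * mass * sum(v * v for v in velocities)
    mean_flow = 0.5 * mass * n * v_mean ** 2
    spread = 0.5 * mass * sum((v - v_mean) ** 2 for v in velocities)
    return total, mean_flow, spread

rng = random.Random(1)
# Hypothetical ion velocities: a common drift plus individual scatter.
ions = [0.3 + rng.gauss(0.0, 1.0) for _ in range(1000)]
total, mean_flow, spread = split_kinetic_energy(ions)
print(total, mean_flow, spread)
```

The total always equals the mean-flow part plus the spread part, so any energy input that is not captured by the average velocity shows up in the spread term.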

Connections of this theory to the formal structure of maximum entropy thermodynamics 
and Bayesian inference have been elaborated upon in Ref. [22]. These connections allow the 
definition of thermodynamic cycles expressing differences between driving protocols using 
the same free energy techniques commonly employed in the equilibrium theory. Some examples 
have already appeared in the literature for path re-weighting [57, 58]. It is expected 
that expression in terms of thermodynamic cycles will greatly simplify the derivation and 
interpretation of these studies. 

We have identified a new generalization of the second law for irreversible processes. A 
traditional analysis shows that the total entropy increase (Eq. [55]) is dependent on the 
details of system dynamics and exchange of conserved quantities with an external system. 
Connecting this with the fluctuation theorem (Eq. H6l) gives a microscopic form for the 
second law of thermodynamics. The physical device of tracking work performed on individual 
particles as well as external reversible work sources allows us to track the flow of each type 
of work (and heat) through the system. Because these changes come directly from the forces 
on each degree of freedom, this analysis does not depend on arbitrary decompositions of energy 
functions or definitions of steady-states. 

From an informational perspective, entropy increase comes about from discarding in- 
formation and/or from the information loss associated with coupling to external reservoirs. 
This is distinguished from the entropy production functional of local equilibrium theory in 
that the entropy functionals developed here include long-range correlations and are not necessarily 
extensive. [59] It is a simple matter to define more complicated baths, for example 
affecting only the average temperature in a given area for imposing thermal gradients. It 
should be noted that the analysis in Sec. [2] showed that increasing the number of controllable 
variables decreases the number of degrees of freedom associated with heat production. 
Molecular insertion and deletion operations will aid in generalizing this approach to include 
imposed chemical potential (insertion force) as well as particle flux boundary conditions, but 
have not been considered here. 

A Analytical Calculation of Stochastic Integrals 

Despite the wealth of literature on the Langevin and Wiener processes, the procedure for 
calculating expectations of time-integrals given in standard references such as Gardiner [5U] 
and Risken[3^ remains complicated. Because Stratonovich integrals appear prominently in 
the present paper, often usurping the role of thermodynamic potentials, we present here two 
alternative methods. Both rely on replacing expressions to be evaluated at the midpoint of 
a time-step with the first-order expansion, $f(\bar{x}) \simeq f(x) + \frac{\epsilon}{2}\,\dot{x}\,\frac{\partial f}{\partial x}$. 

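Using Eq. [18] with $Y_0 = \dot{x}$ to find the energy change, we expand the average velocity 
about the mid-point, 

\begin{align}
\bar{\dot{x}} &= \dot{x}_i + \tfrac{\epsilon}{2} M^{-1}\dot{p} \tag{63}\\
&= \dot{x}_i + \tfrac{1}{2} M^{-1}\left(F\epsilon - \tfrac{\epsilon}{2} C Y \beta - \tfrac{\epsilon}{2} C \bar{\dot{x}} \beta_0 + C^{1/2} dW\right) \tag{64}\\
&= \left(I + \tfrac{\epsilon\beta_0}{4} M^{-1} C\right)^{-1}\left[\dot{x}_i + \tfrac{\epsilon}{2} M^{-1}(F - C Y \beta/2) + \tfrac{1}{2} M^{-1} C^{1/2} dW\right] \tag{65}\\
&= \left(I - \tfrac{\epsilon\beta_0}{4} M^{-1} C\right)\dot{x}_i + \tfrac{\epsilon}{2} M^{-1}(F - C Y \beta/2) + \tfrac{1}{2} M^{-1} C^{1/2} dW + O(\epsilon^{3/2}) \tag{66}\\
&= \dot{x}_i + \tfrac{\epsilon}{2} M^{-1}\dot{p}' + O(\epsilon^{3/2}), \tag{67}
\end{align}

where $\dot{p}'$ is computed using only quantities at the time-step $i$. Multiplying this with 
$\delta A/\delta x(t)$ from Eq. [20], we get 

$$dI_0 = \bar{\dot{x}}^T(\dot{p} - F)\epsilon = \tfrac{1}{2}\, dW^T C^{1/2} M^{-1} C^{1/2} dW + dW^T C^{1/2} \dot{x}_i - \tfrac{\epsilon}{2}\, \beta^T Y^T C \dot{x}_i + O(\epsilon^{3/2}). \tag{68}$$

Since $M^{-1/2} C^{1/2} dW$ is normally distributed with mean zero and variance-covariance matrix 
$M^{-1/2} C M^{-1/2} \epsilon$, $dI_0$ has a noncentral distribution with expectation 

$$\langle dI_0\rangle = \tfrac{\epsilon}{2}\left[\mathrm{Tr}(M^{-1} C) - \beta^T Y^T C \dot{x}_i\right]. \tag{69}$$

For a single constraint, $Y = \dot{x}$, we find a definition of the kinetic temperature. If, 
in addition, we include a constant pulling force, $\beta_1 = -\lambda$, we find 

$$\langle dI_0\rangle = \tfrac{\epsilon}{2}\left[\mathrm{Tr}(M^{-1} C) - \beta_0\left(\tfrac{\lambda}{\beta_0}\mathbf{1} - \dot{x}_i\right)^T C \left(\tfrac{\lambda}{\beta_0}\mathbf{1} - \dot{x}_i\right) + \lambda\left(\tfrac{\lambda}{\beta_0}\mathbf{1} - \dot{x}_i\right)^T C\, \mathbf{1}\right]. \tag{70}$$

What emerges is a kinetic temperature with respect to the terminal velocity, $\lambda/\beta_0$, as well 
as a heating term. 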

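This method can also be used to prove Eq. [57] starting from the expansion 

\begin{align}
\bar{y} &= y_i + \epsilon\dot{y}/2 \tag{71}\\
&= y_i - \tfrac{1}{2}\left(v(t)\epsilon + \tfrac{\beta k C}{2}\,\bar{y}\,\epsilon - C^{1/2} dW\right) \tag{72}\\
&= \left(1 + \tfrac{\epsilon\beta k C}{4}\right)^{-1}\left(y_i - \tfrac{v\epsilon}{2} + \tfrac{1}{2} C^{1/2} dW\right) \tag{73}\\
&= \left(1 - \tfrac{\epsilon\beta k C}{4}\right) y_i - \tfrac{v\epsilon}{2} + \tfrac{1}{2} C^{1/2} dW + O(\epsilon^{3/2}). \tag{74}
\end{align}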



As discussed in the text, these integrals should also result from differentiating a partition 
function (Eq. [TTl) . We present an extended derivation of the main results of this approach 
here. Both Langevin and Brownian equations can be derived as appropriate limits of the 
constraints 

$$\begin{bmatrix} \dot{q}\,\beta/2 + g(q)\lambda/2 \\ \dot{v}\,\beta/2 + h(q)\lambda/2 \end{bmatrix} \tag{75}$$
where $\lambda g(q)$ and $\lambda h(q)$ introduce external forces. Next, we make the one-half step 
substitutions, 

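$$\begin{bmatrix} \partial A/\partial q \\ \partial A/\partial v \end{bmatrix} \to \begin{bmatrix} M\left(f(q) + F(q)\dot{q}\epsilon/2 - \dot{v}\right) \\ M\left(\dot{q} - v - \dot{v}\epsilon/2\right) \end{bmatrix}, \qquad g(q) \to g(q) + G(q)\dot{q}\epsilon/2, \tag{76}$$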



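using the force-per-mass, $f$, the appropriate derivative matrices $\{F, G, H\}_{ij} = \partial\{f, g, h\}_i/\partial q_j$, 
and defining $J_q = M\Gamma_q M = C_q^{-1}/2$, $J_p = M\Gamma_p M = MC^{-1}M/2$. Factoring $\eta$ gives a normal 
distribution for $[\dot{v}, \dot{q}]^T\epsilon$ with penalty matrix (inverse of the variance-covariance matrix, 
keeping terms below $O(\epsilon)$) 

$$\begin{bmatrix} J_p + \epsilon M\beta/4 & \epsilon\left(MG\lambda/4 - F^T J_p - J_q\right)/2 \\ \epsilon\left(G^T M\lambda/4 - J_p F - J_q\right)/2 & J_q - \epsilon\left(MH\lambda + MF\beta\right)/4 \end{bmatrix} \tag{77}$$

and mean (to first order in $\epsilon$) 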
V 



e. 



(7J 



These expressions are in accord with Eqns. [20] and | 

The residual terms contribute to form the transition free energy functional (again to 
order $\epsilon$). 



J" 



i [In M - (/3t; + XgfCi^v + Xg)e/A 



{f3Mf + \MhfCg{f3Mf + \Mh)e/4\ . 



(79) 
(80) 



Note that the Fokker-Planck equation can be used to prove that the Boltzmann distribution 
is stationary under either the Langevin ($C_q \to 0$) or Brownian ($C^{-1} \to 0$) limits, but 
not both. For the Langevin limit, it can be checked that the derivative of this equation with 
respect to $\beta/2$ gives Eq. [70]. For the Brownian limit, we find again Eq. [57]. These rely on 
the following expansion for the derivative of the log-determinant term 



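$$\frac{\partial \ln|\epsilon P|}{\partial a} = \mathrm{Tr}\left[(\epsilon P)^{-1}\frac{\partial(\epsilon P)}{\partial a}\right],$$

$$\epsilon P = P_0 + \epsilon P_1 + O(\epsilon^2), \qquad P_0 = \begin{bmatrix} J_p & 0 \\ 0 & J_q \end{bmatrix},$$

$$(\epsilon P)^{-1} = P_0^{-1} - \epsilon P_0^{-1} P_1 P_0^{-1} + O(\epsilon^2).$$

Since $\partial(\epsilon P)/\partial a$ should contain a prefactor of $\epsilon$, the second term is usually unimportant, so that 

$$\frac{\partial \ln|\epsilon P|}{\partial a} = \mathrm{Tr}\left[P_0^{-1}\frac{\partial(\epsilon P)}{\partial a}\right]. \tag{81}$$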
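
The log-determinant expansion can be checked numerically on a small example. The sketch below compares a central finite difference of $\ln|\epsilon P|$ with the exact trace formula $\mathrm{Tr}[(\epsilon P)^{-1}\partial(\epsilon P)/\partial a]$ and with the leading-order form $\mathrm{Tr}[P_0^{-1}\partial(\epsilon P)/\partial a]$. The 2x2 matrices, the linear dependence of $P_1$ on the parameter $a$, and the helper names are hypothetical choices made only for this illustration.

```python
import math

def det2(p):
    """Determinant of a 2x2 matrix."""
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

def inv2(p):
    """Inverse of a 2x2 matrix."""
    d = det2(p)
    return [[p[1][1] / d, -p[0][1] / d], [-p[1][0] / d, p[0][0] / d]]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

P0 = [[2.0, 0.0], [0.0, 3.0]]     # hypothetical leading-order penalty matrix

def eP(a, eps):
    """Hypothetical eps*P(a) = P0 + eps*P1(a), with P1 linear in the parameter a."""
    return [[P0[0][0] + eps * a, eps * 0.5 * a],
            [eps * 0.5 * a, P0[1][1] - eps * 2.0 * a]]

def d_eP(eps):
    """Derivative of eps*P with respect to a; it carries the prefactor eps."""
    return [[eps, eps * 0.5], [eps * 0.5, -2.0 * eps]]

a, eps, h = 0.7, 1e-3, 1e-6
finite_diff = (math.log(det2(eP(a + h, eps)))
               - math.log(det2(eP(a - h, eps)))) / (2.0 * h)
exact = trace_of_product(inv2(eP(a, eps)), d_eP(eps))
leading = trace_of_product(inv2(P0), d_eP(eps))
print(finite_diff, exact, leading)
```

The finite difference and the exact trace agree to rounding error, while the leading-order form differs from them only at order $\epsilon^2$, which is the sense in which the second term of the inverse expansion is unimportant.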